AFOSR-TR-95-0089



This report on the algebraic theory of structured singular values wraps up an era of
mathematical research on $\mu$ sponsored by the Air Force Office of Scientific Research
(AFOSR). During the last thirteen years, the structured singular value has developed from
a fledgling concept to a household word in the aerospace-control community. Today we have
commercially supported software enabling control designers to use $\mu$-tools in practical,
full-scale applications. Though there remain unanswered theoretical questions, from an
aerospace-control practitioner's point of view it is time to declare the problem solved.

Still, as a mathematician, I find it difficult to walk away from a problem as interesting
and challenging as the structured singular value, especially when there remain basic,
unanswered questions. Most practical problems we face can be solved by using the
upper-bound estimate (see [1]) for $\mu$, but this bound gives no information about the
worst-case parameter sets.

The algebraic theory presented in this report is the product of my efforts, over the last
five years, to compute the value of the $\mu$-function exactly (for a special structure) and
to construct worst-case parameter sets. Happily, I seem to have made some real progress in
this direction, though there remains a troublesome gap in the theory for general values of
$N$. At least, it seems that the case $N = 4$ can be solved by the analytical method
presented here.

While the theory was being developed for practical applications, I have always felt that
structured singular values should be of interest to the theoretical mathematical community,
though I have seen little evidence of such interest. From a fundamental point of view, the
structured singular value is a natural generalization of the operator-theoretic concepts of
norm and spectral radius, measuring the size of a linear operator. In addition, as is shown
in this report, the computation of structured singular values leads directly to
computational problems in intersection theory and invariant theory. Surely there are more
general uses for such a natural and interesting concept, going beyond the control-theoretic
applications that the aerospace community has found for it.

This report is written primarily for mathematicians, to explain a little about the
practical control applications and to describe the status of the algebraic theory. Industry
could benefit from further progress in this area, especially if significant simplification
in the computational approach could be found. I hope that someone with the right blend of
interest, energy and talent will choose this theory as an object of study and improve on the
results presented here.

I would like to thank John Doyle, Allen Tannenbaum, Dave Morrison, Joel Roberts and my
colleagues at Honeywell (especially Mike Elgersma) for numerous helpful discussions during
the development of the theory. Thanks also to Marc Jacobs at AFOSR who made it possible
for me to spend some of my time working on this research topic.


Blaise  Morton 
9  January  1995 
1 Introduction


The  subject  of  this  report  is  a  special  family  of  functions  defined  on  the  set  of 
square  matrices  with  complex  entries.  Each  of  these  functions  measures  the 
size of the matrix, according to some criterion. The operator norm $\|M\|$ of
the matrix $M$ defined by

$$\|M\| = \max\{\, \|Mu\| : \|u\| = 1 \,\} \tag{1}$$

is  a  special  example  in  the  family  of  functions  we  will  be  working  with. 

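The functions of interest are now defined. Let $\mathcal{M}(N)$ denote the set of
$N \times N$ complex matrices, and let $M \in \mathcal{M}(N)$. Let
$J = \{i_1, \ldots, i_n\}$ be a partition of $N$, that is

$$N = \sum_{k=1}^{n} i_k \tag{2}$$

Let $\mathcal{D}_J$ denote the set of block diagonal matrices:

$$\mathcal{D}_J = \left\{ \begin{bmatrix} \Delta(i_1) & 0 & \cdots & 0 \\ 0 & \Delta(i_2) & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \Delta(i_n) \end{bmatrix} \;\middle|\; \Delta(i_k) \in \mathcal{M}(i_k) \right\} \tag{3}$$

Let $\mathcal{D}_J(\delta)$ denote the set of all $\Delta \in \mathcal{D}_J$ of operator
norm less than or equal to $\delta$. Suppose for some value of $\delta$ there is a
$\Delta \in \mathcal{D}_J(\delta)$ such that $\mathrm{Det}(I + M\Delta) = 0$, where
$I \in \mathcal{M}(N)$ is the identity matrix. In that case, denote by $\delta_0$ the
minimum such $\delta$; note that $\delta_0 > 0$. John Doyle [1] defined the function
$\mu_J(M)$:

$$\mu_J(M) = 1/\delta_0 \tag{4}$$

If $\mathrm{Det}(I + M\Delta) \neq 0$ for all $\Delta \in \mathcal{D}_J$ then $\mu_J(M) = 0$.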

The construction above defines a function $\mu_J$ for each positive integer $N$
and each partition $J$ of $N$. When $N$ and $J$ are fixed one writes $\mu(M)$ to
denote $\mu_J(M)$. The function $\mu(M)$ is called the structured singular value of
$M$.

When discussing the structured singular value for a particular partition $J$
we often refer to the problem of computing $\mu_J(M)$ by the number of blocks
on the diagonal of $\Delta$, i.e. the cardinality of $J$. For example, if $N = 8$ and
$J = \{1,3,4\}$ we would refer to the computation of $\mu(M)$ as a three-block
problem.
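The three-block example can be made concrete with a short Python sketch (NumPy assumed; the helper name `block_diag_from_partition` is ours, not notation from the report). It assembles a random block-diagonal matrix for the partition {1, 3, 4} of N = 8 and checks the standard fact that its operator norm equals the largest of the block norms.

```python
import numpy as np

def block_diag_from_partition(blocks):
    """Place the given square blocks along the diagonal of one matrix."""
    sizes = [b.shape[0] for b in blocks]
    delta = np.zeros((sum(sizes), sum(sizes)), dtype=complex)
    offset = 0
    for block, size in zip(blocks, sizes):
        delta[offset:offset + size, offset:offset + size] = block
        offset += size
    return delta

rng = np.random.default_rng(0)
J = [1, 3, 4]  # three-block partition of N = 8
blocks = [rng.standard_normal((i, i)) + 1j * rng.standard_normal((i, i)) for i in J]
delta = block_diag_from_partition(blocks)

# Operator norm (largest singular value) of the whole matrix and of each block.
norm_delta = np.linalg.norm(delta, 2)
block_norms = [np.linalg.norm(b, 2) for b in blocks]
print(delta.shape, np.isclose(norm_delta, max(block_norms)))
```

Scaling each block separately therefore gives direct control over which norm ball the assembled matrix belongs to.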

A more general definition of $\mu$ (allowing repeated blocks in $\Delta$) and some
of  its  fundamental  properties  were  first  presented  in  [1].  The  definition  above 
is  satisfactory  for  our  purposes  here. 

It is easy to check that for the single-block partition, $J = \{N\}$, the
function $\mu$ is simply the operator norm. As is well known, the operator norm
of the matrix $M$ is computable by algebraic methods. Form the polynomial
$p(r)$:

$$p(r) = \mathrm{Det}(r^2 I - M M^*) \tag{5}$$

and the operator norm of $M$, $\mu(M)$, is the largest real root of the polynomial
$p(r)$. The operator norm of $M$ is the same thing as the maximum singular
value $\bar{\sigma}(M)$. Numerically robust software for computing $\bar{\sigma}(M)$ (and all
the other singular values) has been commercially available for nearly twenty
years, predating the definition of the structured singular value. The structured
singular value function $\mu(M)$ is a generalization of the maximum singular
value function, as the nomenclature suggests.
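The single-block case can be checked numerically. The sketch below (Python with NumPy; the function names and the test matrix are our own choices) finds the largest real root of Det(r^2 I - M M^*) and compares it with the maximum singular value. It also builds a rank-one perturbation of norm 1/sigma_max for which Det(I + M Delta) vanishes, which is one possible worst-case perturbation in this unstructured setting.

```python
import numpy as np

def operator_norm_from_polynomial(M):
    """Largest real root r of p(r) = Det(r^2 I - M M^*)."""
    gram = M @ M.conj().T
    coeffs = np.poly(gram)        # coefficients of Det(x I - M M^*), with x = r^2
    x_roots = np.roots(coeffs)
    return float(np.sqrt(np.max(x_roots.real)))

def single_block_worst_case(M):
    """A Delta of norm 1/sigma_max(M) with Det(I + M Delta) = 0."""
    U, s, Vh = np.linalg.svd(M)
    u1 = U[:, [0]]                # leading left singular vector (column)
    v1 = Vh[[0], :].conj().T      # leading right singular vector (column)
    return -(v1 @ u1.conj().T) / s[0]

M = np.array([[1.0, 2.0j], [3.0, 4.0]])
r_max = operator_norm_from_polynomial(M)
sigma_max = np.linalg.norm(M, 2)
delta = single_block_worst_case(M)
det_value = np.linalg.det(np.eye(2) + M @ delta)
print(r_max, sigma_max, abs(det_value))
```

Root finding on the characteristic polynomial is used here only to mirror equation (5); in practice a singular value decomposition is the numerically preferable route.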

The problem we consider in this report is the computation of the value of
$\mu(M)$ for general values of $N$ and the $N$-block partition $J = \{1, \ldots, 1\}$. Our
objective is to derive a polynomial $p(r)$ whose largest real root is the value
$\mu(M)$. Such a polynomial is a generalization of the one in equation (5).

1.1 Computation of $\mu$

The primary motivation for our research is to compute the value of $\mu$. While
much effort has gone into computing bounds for the various $\mu$-functions,
there has been relatively little progress toward computing $\mu$ exactly. A related
problem of practical interest, which we also address here, is to determine
the worst-case parameter sets, i.e. specific matrices $\Delta \in \mathcal{D}_J(\delta_0)$ for which
$\mathrm{Det}(I + M\Delta) = 0$. In the following we embark on an algebraic theory of
structured singular values, the goal of which is to solve these two problems
by algebraic methods. To minimize the complexity of the analysis we concentrate
on the particular case in which the matrix $\Delta$ is an ordinary diagonal
matrix. This line of research, started four years ago [2], has evolved to a point
where existing computer tools are adequate to perform the computations for
small $N$. By these methods, for the first time we have been able to compute
$\mu$ and find worst-case parameter sets for a four-parameter problem (i.e. four
blocks, with each block a scalar parameter). See Section 8 for an example.

Unfortunately, the approach leads to impractical computational algorithms
for large $N$. We do not know whether a more efficient algorithm
is possible. The current theory leads to a conjecture on the growth in complexity
as the number of parameters increases, but we do not have definite
results. See the formula at the end of Section 4.

The $\mu$-function, developed to solve practical engineering problems, motivates
some interesting mathematical theory. The theory is now beyond the
stage where practicing engineers are equipped to contribute, so it is hoped
that mathematicians will take over and (perhaps) develop some new theory to
answer the outstanding questions. Of special interest is the question whether
a polynomial-time solution algorithm (polynomial in $N$) can be found. To
help motivate the problem for non-engineers, a brief history of $\mu$ and its
engineering significance is presented in the next introductory section.

We  hope  that  mathematicians  and  engineers  alike  will  find  something  of 
value  in  this  presentation. 

1.2 The Significance and History of $\mu$

Modern  control  engineers  approximate  real  systems  with  finite-dimensional 
linear  time-invariant  (FDLTI)  models.  These  models  are  in  the  form  of  an 
inhomogeneous ODE:

$$\frac{dx}{dt} = Ax + Bu \tag{6}$$

$$y = Cx + Du \tag{7}$$

where  x  is  the  state-vector,  u  is  the  control  input  vector,  y  is  the  output 
vector,  and  A,  B,  C,  D  are  constant  matrices. 

The  technique  of  representing  the  system  by  the  matrices  A,  B,  C,  D  is 
convenient from a mathematical viewpoint, but its limitations must be
recognized.

First, during the design phase, the parameters in these matrices cannot
be predicted exactly. One often supposes a nominal system model, derived
from physical principles, but the behavior of a real-world system will not
coincide exactly with its nominal model. To account for this type of
uncertainty, the designer may think of the model as an unknown point in a
specified multidimensional neighborhood of the nominal point in the space
of A, B, C, D matrices.

Second,  even  after  the  system  is  built,  one  usually  cannot  measure  all  the 
matrix coefficients in a real-world situation. The frequency response (transfer
function)  of  a  physical  system  is  a  more  practical  thing  to  measure.  For  this 
second  reason  (and  various  other  reasons),  many  practical  engineers  prefer 
frequency-domain  models  of  their  systems.  To  account  for  uncertainty  in 
frequency-domain  models,  engineers  augment  their  nominal  models  in  two 
ways: 

1. with exogenous noise inputs assumed to lie in a frequency-weighted
unit ball in the Hardy space $H_2$

2. with perturbative transfer functions assumed to lie in a frequency-weighted
unit ball in the Hardy space $H_\infty$

The  reader  should  be  aware  that  uncertainties  in  both  time-domain  and 
frequency-domain models often play important roles in the same control
system  design.  The  construction  of  a  perturbation  structure,  to  account 
for  both  types  of  uncertainties  in  a  real-world  system,  is  a  key  part  of  the 
control-engineering  art. 

The $\mu$-approach to control theory uses both time-domain and frequency-domain
concepts. First, a multidimensional box is constructed (mathematically)
in the A, B, C, D space. The assumption is made that the system
model could lie anywhere within this box. Next, by algebraic manipulations,
a parametric representation of the entire box-worth of systems is derived.
The associated parametric set of frequency-domain models is then augmented
with exogenous inputs and perturbative operators to produce the perturbation
structure. Finally, a controller is found (if possible) that guarantees
good stability and performance properties for every system model contained
in the perturbation structure. This process is called robust control design.
There is a substantial body of theory underlying this construction (see [3]
and the references contained there); here we shall only describe the basic
concept behind the frequency-domain stability criterion.

The researcher in practical control theory should have a firm grasp of the
frequency-domain theory and its practical significance. It is no exaggeration
to say that the standard time-domain theory, by itself, is inadequate for
practical applications. For the benefit of those who want more background,
the remainder of this subsection is a quick introduction to the
frequency-domain approach.

Begin  by  assuming  that  you  are  operating  a  real  physical  device,  with 
knobs  (control  input  u)  and  dials  (measurement  vector  y).  Suppose  the 
physical  system  has  been  given  to  you  at  a  steady-state  condition  -  the 
input  vector,  output  vector  and  internal  system  states  are  all  constant.  This 
assumption  of  steady-state  condition  cannot  be  verified  by  direct  physical 
observation  because  the  notion  of  the  internal  state  is  a  theoretical  construct. 
Even so, in many practical systems, if the knobs are all fixed and the dials all
indicate  constant  outputs,  and  if  the  system  seems  to  be  behaving  properly 
in  all  other  respects,  the  steady-state  assumption  is  made. 

Now wiggle one of the control inputs by a (small) sinusoidal signal and
measure the (small) variations of each output signal. Let the perturbing
input signal be $A_k \sin(\omega t)$ in the $k$-th input channel and measure the additive
perturbation on the $j$-th output signal. The $j$-th output will be perturbed by
a signal that looks close to $B_{j,k} \sin(\omega t + C_{j,k})$. The ratio $B_{j,k}/A_k$ is called
the gain and the angle $C_{j,k}$ is called the phase shift of the system from the
$k$-th input to the $j$-th output at the frequency $\omega$. The data obtainable in this
fashion can be collected into a family of complex matrices $F(\omega)$ parametrized
by the frequency $\omega$:

$$F(\omega) = [F_{j,k}(\omega)] = \left[ \frac{B_{j,k}}{A_k}\, e^{\sqrt{-1}\, C_{j,k}} \right] \tag{8}$$


Call this matrix $F(\omega)$ the frequency response of the linear system. Assuming
a linear system response to (small) perturbations, an analytic expression for
$F(\omega)$ can be derived from the associated A, B, C, D matrices by using the
Laplace transform. In deriving such an expression, it is customary in the
engineering literature to let the Laplace transform variable $s$ denote $\sqrt{-1}\,\omega$
and use the argument $s$ instead of $\omega$ for the transfer function $T(s)$. The
transfer function $T(s)$ is defined for all complex values of the parameter $s$
that are not eigenvalues of $A$, as follows:

$$T(s) = D + C(sI - A)^{-1}B \tag{9}$$

For values of $s$ on the imaginary axis, $s = \sqrt{-1}\,\omega$, we have $T(s) = F(\omega)$. The
transfer function $T(s)$ is the basic object of attention in frequency-domain
methods. The system is stable if and only if all the poles of the transfer
function lie in the open left-hand plane of the complex $s$-domain.
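As a small numerical illustration of equation (9), the following Python sketch (NumPy assumed; the function names and the example system are ours) evaluates the transfer function at one point on the imaginary axis, reads off gain and phase, and tests stability through the eigenvalues of A. The eigenvalue test coincides with the pole condition only under the assumption that the realization is minimal, which holds for the example chosen here.

```python
import numpy as np

def transfer_function(A, B, C, D, s):
    """Evaluate T(s) = D + C (sI - A)^{-1} B at one complex point s."""
    n = A.shape[0]
    return D + C @ np.linalg.solve(s * np.eye(n) - A, B)

def is_stable(A):
    """True when every eigenvalue of A lies in the open left half plane."""
    return bool(np.all(np.linalg.eigvals(A).real < 0))

# Example system with T(s) = 1 / (s^2 + 3 s + 2); eigenvalues of A are -1 and -2.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

omega = 1.0
F = transfer_function(A, B, C, D, 1j * omega)
gain = abs(F[0, 0])            # ratio of output to input amplitude at omega
phase = np.angle(F[0, 0])      # phase shift in radians at omega
print(gain, phase, is_stable(A))
```

For this system the value at omega = 1 is 1/(1 + 3i), so the gain is 1/sqrt(10) and the output lags the input.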
1.2.1 Frequency-Domain Uncertainty - The Small Gain Theorem

From an analytical point of view, it is essential to recognize that a
frequency-domain model (transfer function) $T(s)$, whether computed mathematically
or measured experimentally, is an approximation. The principal weaknesses
of such a model vary with application, although all practical models suffer
from three basic limitations:

1.  They  vary  as  a  function  of  steady-state  condition 

2.  They  are  accurate  only  for  small  (perturbation)  signal  inputs 

3.  They  are  accurate  only  for  a  bounded  range  of  frequencies 

These modeling limitations complicate the analysis of real-world systems
having significant nonlinearities, range of operating point and
frequency-dependent model uncertainty. The $\mu$-theory was developed primarily to
address systems in which this last class of problems is the primary concern. We
will be addressing problems associated with frequency-domain uncertainty in
FDLTI systems.

One basic tenet of frequency-domain uncertainty is: model uncertainty
tends to increase at high frequencies. The range of frequencies for which
the model is accurate depends on the physical properties of the system
elements. There are many factors that contribute to high-frequency uncertainty
- two important examples are sensor limitations and actuator/power-supply
limitations. With increasing frequency it becomes increasingly difficult and
expensive to produce sensors and actuators that work close to any predictable
analytical model. Because cost is a vital factor in system design, mathematical
models are often not valid at frequencies above the range required for
practical system operation.

Before  proceeding,  it  is  worth  observing  that  model  uncertainty  is  the 
primary  motivation  for  feedback  control.  Considering  the  issue  abstractly, 
if  our  models  (including  knowledge  of  the  initial  state)  were  perfect,  there 
would  be  no  need  to  consider  adjustment  of  a  control  input  based  on  sensor 
measurements.  The  control  designer  could  include  a  simulation  of  the  perfect 
model  in  his  control  laws  and  use  a  simulated  value  in  place  of  any  physical 
measurement.  Thus,  feedback  control  strategy  depends  in  a  fundamental 
way  on  the  uncertainty  characteristics  of  the  model. 
Let us consider briefly some practical issues (there are many) associated
with uncertainty. First, if the open-loop system is inherently unstable (as
many aerospace vehicles are) there is the issue of robust stabilization. There
have been real system designs (poor ones) where the nominal closed-loop
system is stable but a small change in model parameters produces an unstable
closed-loop system. We want to ensure that our systems remain stable for all
parameter variations within a specified range of values.

Second,  even  if  robust  stability  is  not  a  problem,  there  is  the  issue  of 
robust  performance.  Parameter  variations,  large  or  small,  can  influence  the 
performance  of  a  closed-loop  system.  Ideally,  the  closed-loop  system  should 
be  relatively  insensitive  to  variation  of  parameters  within  a  set  of  anticipated 
ranges. 

Third,  whether  the  open-loop  system  is  stable  or  not,  we  are  concerned 
with  the  potential  destabilizing  effects  of  perturbations  to  the  system  state. 
State  perturbations  arise  when  the  outside  world  interacts  with  our  system, 
causing  changes  in  state  not  predicted  by  our  nominal  model.  A  wind  gust 
acting  on  an  airplane  is  a  typical  example. 

Finally,  we  are  concerned  with  uncertainties  in  system  dynamics,  whether 
because  of  internal  dynamics  neglected  in  our  design  models  or  because  of 
subsystem  failures.  Uncertainties  of  all  four  types  are  considered  in  a  typical 
control  design.  We  have  a  limited  ability  to  represent  them  and  to  design 
control  systems  accommodating  them,  but  these  are  the  real  issues  that  drive 
robust  control  design. 

We  now  address  analytic  representation  of  frequency-domain  uncertainty. 
Consider  a  nominal  transfer  function  T(s)  subject  to  uncertainty.  Much  work 
has been devoted to modeling various types of uncertainty (an early reference
is [4]); let us take additive uncertainty as one simple example. The resulting
structure  will  be  applicable  to  many  other  types  of  uncertainty. 

Suppose the nominal model $T(s)$ has $n$ inputs and $m$ outputs. A feedback
controller $K(s)$ of general type will have $n$ outputs and $m + k$ inputs. The $n$
outputs of $K(s)$ are identified with the inputs of $T(s)$, the last $m$ inputs of
$K(s)$ are identified with the outputs of $T(s)$, and the first $k$ inputs of $K(s)$
are identified with command inputs from an external source (e.g. the pilot of
an airplane). The closed-loop system now has only $k$ inputs (the externally
generated commands) but it still has the same $m$ outputs as the original
open-loop system $T(s)$.
We assume the transfer function $T(s)$ is perturbed additively by some
unknown, stable function $W(s)\Delta(s)$ where $W(s)$ is a specified $m \times m$ transfer
function (weighting matrix) and $\Delta(s)$ is an unknown transfer function in the
unit ball (relative to the operator norm) of the space of $m \times n$ matrices with
$H_\infty$ entries. The open-loop system becomes

$$T_{pert}(s) = T(s) + W(s)\Delta(s) \tag{10}$$

Consider what happens when the fixed controller $K(s)$ is used to close the
loop for $T_{pert}(s)$. The robust stability question is: is it true that for all $\Delta(s)$
in the unit ball of $H_\infty$, the closed-loop system obtained by replacing $T(s)$
with $T_{pert}(s)$ is stable?

To answer the robust stability question we first perform an algebraic
transformation of the problem. We assume that the nominal closed-loop
system ($\Delta(s) = 0$) is stable. Then it is an elementary exercise to construct a
stable transfer function $M(s)$, independent of $\Delta(s)$, with $m + k$ inputs and
$n + m$ outputs with the following property: when the first $n$ outputs of $M(s)$
are closed through $\Delta(s)$ to the first $m$ inputs of $M(s)$, the closed loop system
is the same as that obtained by closing the bottom loops of $T_{pert}(s)$ through
the last $m$ inputs of $K(s)$. Partitioning $M$ into blocks, we find that the closed
loop transfer function has the form:

$$F_u(M, \Delta) = M_{22} + M_{21}\Delta(I - M_{11}\Delta)^{-1}M_{12} \tag{11}$$

From this expression we see that the closed-loop system will be stable for all
$\Delta$ of norm less than 1 if and only if the factor $(I - M_{11}\Delta)^{-1}$ is stable for all
such $\Delta$. Clearly, if the $H_\infty$ operator-norm of $M_{11}(s)$ is less than 1 we can
conclude robust stability. In the case where $\Delta$ has no additional structure,
this sufficient condition turns out to be necessary - that is the small gain
theorem.
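The closed-loop formula can be sanity-checked at a single frequency, where every transfer function is just a constant matrix. The Python sketch below (NumPy assumed; function names, block sizes and the random data are ours) evaluates M22 + M21 Delta (I - M11 Delta)^{-1} M12 and compares it with a direct solution of the loop equations z = M11 w + M12 u, w = Delta z, y = M21 w + M22 u.

```python
import numpy as np

def upper_lft(M, Delta, p, q):
    """M22 + M21 Delta (I - M11 Delta)^{-1} M12, where M11 is the p x q corner of M."""
    M11, M12 = M[:p, :q], M[:p, q:]
    M21, M22 = M[p:, :q], M[p:, q:]
    return M22 + M21 @ Delta @ np.linalg.solve(np.eye(p) - M11 @ Delta, M12)

def closed_loop_direct(M, Delta, p, q, u):
    """Solve the loop equations for (z, w) jointly, then form the output y."""
    M11, M12 = M[:p, :q], M[:p, q:]
    M21, M22 = M[p:, :q], M[p:, q:]
    top = np.hstack([np.eye(p), -M11])        # z - M11 w = M12 u
    bottom = np.hstack([-Delta, np.eye(q)])   # w - Delta z = 0
    rhs = np.concatenate([M12 @ u, np.zeros(q)])
    zw = np.linalg.solve(np.vstack([top, bottom]), rhs)
    w = zw[p:]
    return M21 @ w + M22 @ u

rng = np.random.default_rng(1)
p, q = 2, 3                                   # M11 is p x q, Delta is q x p
M = 0.3 * rng.standard_normal((p + 2, q + 2))
Delta = 0.3 * rng.standard_normal((q, p))
u = rng.standard_normal(2)

y_formula = upper_lft(M, Delta, p, q) @ u
y_direct = closed_loop_direct(M, Delta, p, q, u)
print(np.allclose(y_formula, y_direct))
```

The only place where the perturbation can cause trouble is the inverse of I - M11 Delta, which is why the stability discussion focuses on that factor.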

In those cases where the uncertainty is known to have block-diagonal
structure, however, the sufficient condition $\|M_{11}(s)\| < 1$ is no longer
necessary. Block-diagonal conditions on $\Delta$ arise naturally in many situations;
for example, when a collection of physically-isolated, uncertain systems $T_j(s)$
are cascaded. Associated with each $T_j(s)$ will be a separate $\Delta_j(s)$, and the
overall $\Delta(s)$ for the cascaded system will have block-diagonal form.

The small-gain theorem can be applied in the case of block-diagonal $\Delta$,
but the test is too conservative for many practical applications. Often, the
designer is forced to sacrifice too much performance in order to pass the
small-gain test, so a better criterion is needed. From this need was born
the structured singular value test, which is evaluated by computing the
$\mu$-function.
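A two-by-two example of our own choosing shows how large the gap between the two tests can be. For the matrix M below, Det(I + M Delta) equals 1 for every diagonal Delta, so the structured singular value for two scalar blocks is 0 by definition, while the maximum singular value is 10. A full (unstructured) Delta of norm 0.1 already makes I + M Delta singular. The Python sketch (NumPy assumed) checks both statements numerically.

```python
import numpy as np

M = np.array([[0.0, 10.0], [0.0, 0.0]], dtype=complex)
sigma_max = np.linalg.norm(M, 2)          # the small-gain quantity, equal to 10

# Diagonal perturbations, even very large ones, never make I + M Delta singular.
rng = np.random.default_rng(0)
structured_dets = []
for _ in range(1000):
    d = 100.0 * (rng.standard_normal(2) + 1j * rng.standard_normal(2))
    structured_dets.append(np.linalg.det(np.eye(2) + M @ np.diag(d)))

# A full perturbation of norm 1/sigma_max does.
full_delta = np.array([[0.0, 0.0], [-1.0 / sigma_max, 0.0]])
unstructured_det = np.linalg.det(np.eye(2) + M @ full_delta)
print(sigma_max, np.allclose(structured_dets, 1.0), abs(unstructured_det))
```

A design judged by the singular value alone would have to tolerate perturbations that, under the diagonal structure, can never occur.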

In applications, the transfer function $M_{11}(s)$ is represented in the computer
via a state-space realization (A, B, C, D matrices). For each $s_j = \sqrt{-1}\,\omega_j$
in a grid on the imaginary axis the transfer function $M_{11}(s_j)$ is computed.
The value of $\mu(M_{11}(s_j))$ (or some bounding function) is then computed. For
example, the maximum singular value $\bar{\sigma}(M_{11}(s_j))$ is the upper bound
corresponding to the small gain theorem. The values of the function are plotted
on a log-magnitude vs. log-frequency graph and displayed to the design
engineer. If it is found that $\mu(M_{11}(s_j))$ is less than one for all values on the grid,
robust stability is concluded. Implicit in this approach is the assumption
that the grid-size is fine enough to make it apparent whether the $\mu$ function
exceeds 1 at any point along the imaginary axis.
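The grid procedure can be sketched in a few lines of Python (NumPy assumed). The state-space data, the grid and the function name below are illustrative assumptions, and the bounding function used is the maximum singular value, i.e. the small-gain test rather than the structured singular value itself.

```python
import numpy as np

def sigma_max_on_grid(A, B, C, D, omegas):
    """Maximum singular value of D + C (sI - A)^{-1} B at s = i*omega over a grid."""
    n = A.shape[0]
    values = []
    for omega in omegas:
        s = 1j * omega
        T = D + C @ np.linalg.solve(s * np.eye(n) - A, B)
        values.append(np.linalg.norm(T, 2))
    return np.array(values)

# Hypothetical stable two-state realization standing in for M11(s).
A = np.diag([-1.0, -2.0])
B = np.eye(2)
C = np.array([[0.5, 0.2], [0.0, 0.5]])
D = np.zeros((2, 2))

omegas = np.logspace(-2, 2, 200)              # log-spaced frequency grid
values = sigma_max_on_grid(A, B, C, D, omegas)
passes_small_gain_test = bool(np.all(values < 1.0))
print(values.max(), passes_small_gain_test)
```

Plotting `values` against `omegas` on logarithmic axes gives the kind of graph described above; a peak that falls between grid points would be missed, which is the caveat about grid size.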

If the $\mu$-function is smaller than 1 at all points on the imaginary axis,
robust stability follows from the properties of continuity of roots of a
polynomial equation (with respect to its coefficients) and the maximum modulus
theorem.

A  different  approach  is  required  if  one  wants  to  consider  only  real  values 
for  some  of  the  uncertain  parameters,  but  we  shall  be  concerned  with  complex 
parameters (complex entries in the blocks $\Delta_j$) in this report.

1.2.2 The Origin of $\mu$

The concept of the Structured Singular Value function, $\mu(M)$, is now more
than a decade old. The early developmental stage of the concept can be
traced back to 1977 when singular values were applied by a group of
control-design engineers at Honeywell's Systems and Research Center to the analysis
of multivariable linear time-invariant systems [5]. Their goal was to find a
multiloop generalization of the famous small-gain theorem, so useful in the
robust-stability analysis of single-input single-output (SISO) systems. At
that time, the accepted practical technique for evaluating robust stability
of multi-loop systems was to open a single loop at a time and apply the
established SISO criteria (gain and phase margins). The Honeywell group
recognized the inadequacy of this one-loop-at-a-time approach and aimed at
a more reliable robust stability test. By analogy with the small-gain theorem,
the solution they sought was an analytic tool for frequency domain analysis.

Shortly after its theoretical development, the singular value approach
was applied to a helicopter flight-control design [6]. During that study it
was found that the methodology displayed some significant weaknesses. The
difficulty was that representation of uncertainty in an unstructured way can
lead to an overly conservative robustness criterion. In general, the robust
stability test in the singular-value methodology was sufficient but far from
necessary. A control system with adequate robust stability could fail the
test. This shortfall was the reverse of the problem inherent with the
single-loop-at-a-time approach, where each loop might look good individually but
the multiloop system as a whole might lack robustness.

The  overconservative  nature  of  the  singular  value  approach  was  easy  to 
understand  but  not  so  easy  to  fix.  A  better  methodology  was  sought:  to 
be  acceptable  it  had  to  be  computable  for  problems  of  realistic  size  and  its 
criteria  for  robust  stability  had  to  be  as  close  to  “necessary  and  sufficient”  as 
possible.  After  several  years,  an  acceptable  solution  was  found  in  the  form 
of  the  Structured  Singular  Value  (SSV). 

In 1981 the mathematical theory of the SSV was introduced by Doyle in
the landmark paper [1]. At about the same time, an engineering application
paper [4] appeared, showing how a wide variety of practical robust stability
problems could be reduced to computing (or bounding) the SSV of a matrix
transfer function, called the perturbation structure. From a theoretical point
of view the SSV was a complete success: a system is robustly stable if and
only if the unperturbed system is stable and the SSV function $\mu(M(s))$ of
its associated perturbation structure $M(s)$ is less than 1 for all values of
$s$ on the imaginary axis. The general MIMO robust stability problem was
reduced to a single class of numerical problems: given a complex $N \times N$
matrix $M$, find a sharp, computable upper bound for $\mu(M)$. A computable
upper bound, which turned out to be good for many applications
(early examples provided in [7] and [8]), was provided by Doyle in [1], and a
powerful methodology for robust control design and analysis was born.

It is worth emphasizing that the upper-bound function, not $\mu$ itself,
is the function used in today's $\mu$-methodology. This upper bound is the
solution of a convex optimization problem and so is easily evaluated on a
computer. It represents a significant improvement on the singular value test
(maximum singular value $\bar{\sigma}(M)$), which is itself an upper bound on $\mu$.
The robust control synthesis methodology, $\mu$-synthesis, is based on a weighted
$H_\infty$ optimization theory associated with the upper bound.
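The report does not write the upper bound out at this point. Assuming it is the diagonal-scaling bound usually associated with [1], namely the infimum of the maximum singular value of D M D^{-1} over positive diagonal D, the following Python sketch (NumPy assumed; the matrix, the grid and the function name are ours) evaluates it for a two-by-two example by a crude one-dimensional search. The spectral radius of M is a lower bound for the structured singular value with complex scalar blocks, so when the two bounds meet the value is pinned down.

```python
import numpy as np

def scaled_upper_bound(M, log_d_grid):
    """Grid approximation of inf over D = diag(d, 1), d > 0, of sigma_max(D M D^{-1})."""
    best = np.inf
    for t in log_d_grid:
        D = np.diag([np.exp(t), 1.0])
        D_inv = np.diag([np.exp(-t), 1.0])
        best = min(best, np.linalg.norm(D @ M @ D_inv, 2))
    return best

M = np.array([[1.0, 10.0], [0.1, 1.0]])
sigma_max = np.linalg.norm(M, 2)                    # unscaled singular value test
spectral_radius = max(abs(np.linalg.eigvals(M)))    # lower bound on mu
upper_bound = scaled_upper_bound(M, np.linspace(-5.0, 5.0, 2001))
print(spectral_radius, upper_bound, sigma_max)
```

For this matrix the scaled bound comes down from about 10.1 to about 2, which is also the spectral radius, so the scaling removes all of the conservatism of the plain singular value test in this instance. A practical implementation would use a convex optimization routine rather than a grid.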

Since the introduction of $\mu$ [1], this concept has been implemented in a
variety of computer tools to quantify the robust stability of feedback control
systems. Engineers have applied these tools successfully in the design phases
of control systems for many advanced aerospace vehicles: the B-2 Bomber,
Space Station, and the F-15 STOL Technology Demonstrator to name just a
few. The $\mu$-analysis and synthesis tools are now a standard part of modern
aerospace system design.

1.3  Structure  of  the  Report 

After  this  introduction  we  move  directly  to  the  algebraic  theory.  In  the 
second  section  we  derive  the  basic  set  of  algebraic  equations,  and  in  the  third 
section  we  show  how  the  elimination  can  be  performed  for  the  cases  of  2,  3 
or  4  parameters.  Some  abstract  theory  associated  with  the  elimination  for 
general  numbers  of  parameters  is  postponed  until  Section  9. 

In  Section  4  we  redirect  attention  to  a  related  algebraic  problem  about 
which  much  is  known.  The  link  between  the  basic  equations  of  Section  2  and 
the  hyperdeterminant  of  a  three-dimensional  matrix  is  shown,  so  that  we  can 
apply  the  known  results  to  our  problem. 

In Section 5 we introduce a third algebraic problem, also related to the
basic equations. This approach was the starting point of the algebraic theory
of [2]; it has definite computational advantages in the cases of two and three
parameters. The general results for this approach in the low-dimensional
cases are described in Section 6. Some extensions of these results to higher
dimension are presented in Section 7.

In Section 8 we illustrate the techniques described in the first seven
sections by computing μ and worst-case parameter sets for numerical examples.
Section 9 is an abstract theoretical presentation of the general approach,
intended for more advanced researchers. Section 10 is a discussion and
summary of results and outstanding issues.

2  Derivation of the Basic Equations

Let M be a complex N × N matrix. As was shown in [1], the structured
singular value μ(M) is given by:

$$\mu(M) = \sup_{\theta \in \mathcal{D}} \rho(e^{i\theta} M) \qquad (12)$$

where ρ is the spectral radius function and $\mathcal{D}$ is the set of
real-diagonal N × N matrices.

The sup function in equation 12 is (effectively) taken over a compact set,
so there is some nonzero N-vector z such that, for some $\theta \in \mathcal{D}$,

$$e^{i\theta} M z = \mu(M) z \qquad (13)$$

Our immediate goal is to determine a polynomial expression whose largest
real root is the value μ(M). For that purpose we introduce the variable
parameter r, and define the system of Hermitian forms $H_k(r)$:

$$H_k(r) = M_k^* M_k - r^2 e_k^* e_k \qquad (14)$$

where $M_k$ is the k-th row of the matrix M and $e_k$ is the row vector whose
k-th entry is 1, all others 0.

Lemma 1  μ(M) is the largest real value of r for which there is a nonzero
N-vector z satisfying

$$z^* H_k(r) z = 0 \qquad (15)$$

for k = 1, ..., N.

Proof of Lemma 1: First we show that the conditions of the lemma
are satisfied if μ(M) is substituted for r. Select z ≠ 0 satisfying equation 13.
Compute the squares of the magnitudes of the entries on each side of
equation 13:

$$(M_k z)^* (M_k z) = \mu(M)^2 \, \bar{z}_k z_k \qquad (16)$$

But this set of equations for k = 1, ..., N is equivalent to equation 15.

Conversely, suppose $r_0$ is the largest real number such that some nonzero
z satisfies equation 15. Then $r_0$ is the largest real number for which there
are $\theta \in \mathcal{D}$ and z ≠ 0 such that

$$e^{i\theta} M z = r_0 z \qquad (17)$$

That is, μ(M) = $r_0$. □
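
The first half of this proof can be checked numerically. The following is a minimal sketch, assuming NumPy and reading $e^{i\theta}$ as the diagonal matrix with entries $e^{i\theta_k}$; the function names, the test matrix and the phase vector are hypothetical choices, not taken from the report. For any real phase vector θ and any eigenpair (λ, z) of $e^{i\theta}M$, taking squared magnitudes row by row gives $z^* H_k(|\lambda|) z = 0$ for every k, which is the computation behind equation 16.

```python
import numpy as np

def hermitian_forms(M, r):
    """Return [H_1(r), ..., H_N(r)] with H_k(r) = M_k^* M_k - r^2 e_k^* e_k."""
    M = np.asarray(M, dtype=complex)
    N = M.shape[0]
    forms = []
    for k in range(N):
        row = M[k:k + 1, :]                  # k-th row of M as a 1 x N matrix
        e_k = np.zeros((1, N))
        e_k[0, k] = 1.0                      # k-th unit row vector
        forms.append(row.conj().T @ row - r**2 * (e_k.T @ e_k))
    return forms

def eigen_residuals(M, theta):
    """Dominant eigenpair of diag(exp(i*theta)) @ M and the values |z^* H_k(r) z|."""
    M = np.asarray(M, dtype=complex)
    D = np.diag(np.exp(1j * np.asarray(theta, dtype=float)))
    eigvals, eigvecs = np.linalg.eig(D @ M)
    idx = np.argmax(np.abs(eigvals))
    r = abs(eigvals[idx])                    # candidate value of r
    z = eigvecs[:, idx]
    return r, [abs(np.vdot(z, H @ z)) for H in hermitian_forms(M, r)]

# Hypothetical 2 x 2 example with an arbitrary phase vector.
M_example = np.array([[1 + 2j, 0.5], [-1j, 2 - 1j]])
r_example, res_example = eigen_residuals(M_example, [0.3, -1.1])
```

The residuals vanish for every θ, so equations 15 alone do not single out μ(M); that is the role of the extremality condition introduced next.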

If we think of $z, \bar{z}$ as a vector in a 2N-dimensional real vector space, the
system of equations 15 gives only N polynomial equations in the 2N + 1 real
variables $z, \bar{z}, r$. Our goal is to eliminate $z$ and $\bar{z}$ in order to obtain a single
polynomial equation in r. Additional polynomial equations are needed for
the elimination; these are derived from the condition that the value of r we
seek is extremal.


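Lemma 2  Let $J(r,z,\bar{z})$ denote the N × 2N matrix

$$J(r,z,\bar{z}) = \begin{bmatrix} z^* H_1(r) & z^T \overline{H_1(r)} \\ \vdots & \vdots \\ z^* H_N(r) & z^T \overline{H_N(r)} \end{bmatrix} \qquad (18)$$

If $(r_0, z, \bar{z})$ is a solution to the system of equations 15 and $r_0$ is extremal
among such solutions, then the rank of $J(r_0,z,\bar{z})$ is less than N.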


Proof of Lemma 2: Consider the function F defined by:

$$F(r,z,\bar{z}) = \begin{bmatrix} z^* H_1(r) z \\ \vdots \\ z^* H_N(r) z \end{bmatrix} \qquad (19)$$

Observe that the N × 2N matrix $J(r,z,\bar{z})$ is the matrix of partial derivatives
of F with respect to $z, \bar{z}$, that is:

$$J(r,z,\bar{z}) = \begin{bmatrix} \dfrac{\partial F}{\partial z} & \dfrac{\partial F}{\partial \bar{z}} \end{bmatrix} \qquad (20)$$

At a point $(r, z, \bar{z})$ where F vanishes and the matrix J of partial derivatives
has full rank N, the implicit function theorem [9] implies that, locally, the zero
set of F can be parametrized smoothly by r and an N-dimensional subset of
$(z, \bar{z})$. But then r cannot be extremal. This contradiction proves the lemma.
□


The previous two lemmas lead directly to a system of polynomial equations
in $(r,z,\bar{z})$ that must be satisfied at a solution of the equations 15 for
which r is extremal. We use the symbol C to denote this set.

The set C contains:

1.  the system of N equations 15

2.  the $\binom{2N}{N}$ size-N determinantal minors from the matrix $J(r,z,\bar{z})$.


It appears that, for a general class of matrices M, the system C generates
a system of polynomials from which the variables $(z, \bar{z})$ can be eliminated.
The eliminant of this system is a polynomial p(r) whose coefficients are
polynomials in the coefficients of $M$ and $\bar{M}$. We call this polynomial p(r) the
r-polynomial. The roots of the r-polynomial are called r-values, and those
r-values that appear at local maxima are called μ-values. The largest real
r-value is a μ-value which is equal to the value of the function μ(M).

The procedure required to perform the elimination depends on N. In the
following we will show how this elimination can be performed for values of
N ≤ 4.

3  Performing the Elimination

We begin this section with a description of the general elimination technique
used to derive the r-polynomial. The details of the computations in the cases
N ≤ 4 are then presented in subsections.

Recall the set of equations C defined in the previous section. We will use
the polynomials in C to generate a system of polynomials bihomogeneous in
$(z,\bar{z})$.

Definition: A polynomial $q(z,\bar{z})$ is called bihomogeneous of bidegree
(i,j) if it is homogeneous of degree i when considered as a function of the
vector $z$ and homogeneous of degree j when considered as a function of the
vector $\bar{z}$.

For example, the expressions $z_3\bar{z}_2$ and $z^* H_k(r) z$ are bihomogeneous
of bidegree (1,1).

For each N, let $P_N(i,j)$ denote the set of bihomogeneous polynomials of
bidegree (i,j). The set $P_N(i,j)$ forms a finite-dimensional vector space over
the real number field $\mathbb{R}$. Also, if $p_1(z,\bar{z}) \in P_N(i_1,j_1)$ and $p_2(z,\bar{z}) \in P_N(i_2,j_2)$
then $p_1(z,\bar{z})\,p_2(z,\bar{z}) \in P_N(i_1 + i_2,\, j_1 + j_2)$.

Our elimination approach makes use of an elementary combinatorial lemma,
stated here without proof.

Lemma 1  The dimension of $P_N(i,j)$ is:

$$\dim(P_N(i,j)) = \binom{N+i-1}{i}\binom{N+j-1}{j} \qquad (21)$$
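
The dimension count can be reproduced by listing monomials. The sketch below assumes that a basis of $P_N(i,j)$ is given by the products of a degree-i monomial in $z_1,\dots,z_N$ with a degree-j monomial in $\bar{z}_1,\dots,\bar{z}_N$; the function names are illustrative only. The values it returns agree with the counts 18, 100, 16 and 400 that are quoted for the cases N = 3 and N = 4 later in this section.

```python
from itertools import combinations_with_replacement
from math import comb

def dim_bihomogeneous(N, i, j):
    """Count monomials of bidegree (i, j) in N variables z and N variables zbar."""
    return comb(N + i - 1, i) * comb(N + j - 1, j)

def basis_monomials(N, i, j):
    """List basis monomials as pairs (indices of z factors, indices of zbar factors)."""
    z_parts = list(combinations_with_replacement(range(1, N + 1), i))
    zbar_parts = list(combinations_with_replacement(range(1, N + 1), j))
    return [(a, b) for a in z_parts for b in zbar_parts]

# Example: the first few basis monomials of bidegree (2, 1) for N = 3.
first_monomials = basis_monomials(3, 2, 1)[:4]
```

Here `((1, 1), (2,))` stands for the monomial $z_1^2\bar{z}_2$, so the enumeration order matches the ordering used for the basis vector in the case N = 3.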

The notion of bihomogeneous polynomials extends in the obvious way
when the coefficients of the polynomials lie in a general ring. Considered
over the ring of real polynomials in the variable r, all of the polynomials in C
are bihomogeneous. Those in equation (15) are of bidegree (1,1), while each
of the size-N minors of the matrix $J(r,z,\bar{z})$ has bidegree (i,j) for a pair
of non-negative integers i, j such that i + j = N.

Our strategy for obtaining the r-polynomial is as follows. Pick a pair of
positive integers $i_T, j_T$ with both $i_T$ and $j_T$ sufficiently large (depending on
N). Over the ring of real polynomials in r, consider the space of
bihomogeneous polynomials $P_N(i_T, j_T)$. Each polynomial q in C is
bihomogeneous; let bidegree(q) = $(i_q, j_q)$. If $i_T \ge i_q$ and $j_T \ge j_q$, q can be multiplied
by any polynomial $h \in P_N(i_T - i_q,\, j_T - j_q)$ (considered over $\mathbb{R}$) to obtain
$qh \in P_N(i_T, j_T)$. In this way the original set of polynomials in C can be used
to generate a larger system of polynomials in the module $P_N(i_T, j_T)$ over the
ring of polynomials in r. Each polynomial generated in this way must vanish
on the set of points we seek. We can represent the entire set of equations in
$P_N(i_T, j_T)$ as a single matrix equation:

$$A(r)\,\Phi(z,\bar{z}) = 0 \qquad (22)$$

where A(r) is a matrix of real polynomials in r and $\Phi(z,\bar{z})$ is a vector of basis
monomials of the vector space $P_N(i_T, j_T)$ over $\mathbb{R}$.

The condition that remains to be verified is that, for a generic value
of r, the rank of the matrix A(r) is equal to the dimension of $P_N(i_T, j_T)$.
Under this condition the r-polynomial p(r) is nontrivial and theoretically
well defined - it can be obtained (in principle) by computing the greatest
common divisor of the maximal minors of the matrix A(r).

The procedure is illustrated in the examples below. As will be seen, once
the roots of the r-polynomial have been found we can recover the z-vector as
well.


3.1  The  Case  N  =  2 

First we determine the polynomials in C. There are two types:

1.  The pair of hermitian forms $z^* H_1(r) z$, $z^* H_2(r) z$

2.  The size-2 minors of the 2 × 4 matrix J

All the equations are bihomogeneous. The two equations of the first type
are independent, of bidegree (1,1). As for the equations of type two, observe
that the 2 × 4 matrix J has the form:

$$J(r,z,\bar{z}) = \begin{bmatrix} z^* H_1(r) & z^T \overline{H_1(r)} \\ z^* H_2(r) & z^T \overline{H_2(r)} \end{bmatrix} \qquad (23)$$

Considering all the size-two minors of J, we find that there are six
equations of the second type: one of bidegree (2,0), four of bidegree (1,1), and
one of bidegree (0,2). Consequently, the set C consists of eight equations.

It turns out that we do not need all the equations in C to perform the
elimination. Let B consist of those six equations of bidegree (1,1). In fact,
there is no loss of generality working with B instead of C. To see this, consider
the 2 × 2 submatrix formed by the first two columns of J. Note that

$$\begin{bmatrix} z^* H_1(r) z \\ z^* H_2(r) z \end{bmatrix} = \begin{bmatrix} z^* H_1(r) \\ z^* H_2(r) \end{bmatrix} z \qquad (24)$$

The two forms in the vector on the left-hand side of equation 24 are in B.
Therefore, if z is in the zero set of B, the determinant of the matrix on the
right-hand side of equation 24 must be zero. The determinant of that matrix
is one of the two equations in C\B. Similarly, by taking the conjugate of
equation 24, we see that the determinant of the last two columns of J (the
other equation in C\B) must vanish as well. We conclude that the zero set
of C is the same as the zero set of B.

Define $\Phi(z,\bar{z})$ by

$$\Phi(z,\bar{z}) = \begin{bmatrix} z_1\bar{z}_1 \\ z_1\bar{z}_2 \\ z_2\bar{z}_1 \\ z_2\bar{z}_2 \end{bmatrix} \qquad (25)$$

There is a 6 × 4 matrix A(r) consisting of (in general) quadratic functions of
r such that the six equations in B may be written in the form of equation 22.
For general matrices M the derived matrix A(r) will be rank four except for
those values of the parameter r in a finite set, denoted Σ. From the matrix
A(r) one can derive a polynomial p(r) of minimal degree whose roots are the
r-values in Σ. In Appendix A the polynomial p(r) is derived.

Suppose the set of real r-values in Σ is known. Let $r_0$ be a value in Σ.
We will show how the vector z can be recovered.

Let V be a nonzero vector such that

$$A(r_0)\,V = 0 \qquad (26)$$

Using the four entries of V, form the 2 × 2 matrix Q:

$$Q = \begin{bmatrix} v_1 & v_2 \\ v_3 & v_4 \end{bmatrix} \qquad (27)$$

If the value $r_0$ corresponds to a solution of B, the matrix Q is rank-one. If
this condition is satisfied, perform the dyadic decomposition:

$$Q = \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} \begin{bmatrix} \bar{z}_1 & \bar{z}_2 \end{bmatrix} \qquad (28)$$

and the vector $(z_1, z_2)$ is a solution of the equations B for $r = r_0$.
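
The recovery step can be sketched with a singular value decomposition. This is an illustration under stated assumptions rather than the report's own procedure: it assumes NumPy, assumes the null vector is ordered as $(z_1\bar{z}_1, z_1\bar{z}_2, z_2\bar{z}_1, z_2\bar{z}_2)$ up to a scalar factor, and uses the hypothetical name `recover_z`. The leading left singular vector of a rank-one matrix proportional to $z z^*$ is proportional to z.

```python
import numpy as np

def recover_z(V, tol=1e-9):
    """Recover z (up to a complex scale factor) from a null vector V of A(r0).

    V is reshaped row by row into the 2 x 2 matrix Q, which should be
    proportional to z z^* when r0 corresponds to a genuine solution.
    """
    Q = np.asarray(V, dtype=complex).reshape(2, 2)
    U, s, Vh = np.linalg.svd(Q)
    if s[1] > tol * s[0]:
        raise ValueError("Q is not numerically rank-one")
    return U[:, 0]                      # unit vector proportional to z

# Synthetic check: build V from a known z and an arbitrary scale factor.
z_true = np.array([1 + 2j, -0.5j])
phi = np.array([z_true[0] * z_true[0].conj(), z_true[0] * z_true[1].conj(),
                z_true[1] * z_true[0].conj(), z_true[1] * z_true[1].conj()])
z_hat = recover_z((0.3 - 0.7j) * phi)
```

The rank-one test doubles as a filter: a root $r_0$ whose null vector fails it does not correspond to a solution of B.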


3.2  The Case N = 3

For N = 3 the approach is similar to the case N = 2. Again, we define a set
C consisting of two types of equations:

1.  The three hermitian forms $z^* H_1(r) z$, $z^* H_2(r) z$, $z^* H_3(r) z$

2.  The size-3 minors of the 3 × 6 matrix J

Again, the equations of the first type are independent, of bidegree (1,1). This
time, however, the equations of type two have bidegree (i,j), where i + j = 3.

We derive from C a set B of bihomogeneous equations of bidegree (2,1).
First, we have the nine equations obtained by multiplying each of the
type-one equations by each of $z_1, z_2, z_3$. To this set we add the nine minor
determinants from J of bidegree (2,1). Constructed in this way, the set B consists
of 18 equations of bidegree (2,1).

By equation 21, the dimension of $P_3(2,1)$ is 18. The equations in B form
a system of 18 equations in the 18 monomials that span $P_3(2,1)$. Picking a
basis, let $\Phi(z,\bar{z})$ denote

$$\Phi(z,\bar{z}) = [\,z_1^2\bar{z}_1,\ z_1^2\bar{z}_2,\ z_1^2\bar{z}_3,\ z_1 z_2\bar{z}_1,\ \ldots,\ z_3^2\bar{z}_3\,]^T \qquad (29)$$

With respect to this vector Φ we can write the matrix A(r) of equation 22.
It is 18 × 18; nine of its rows are affine in r while the other nine are cubic.
The determinant of the matrix A(r) is a polynomial p(r) of degree 36.

We have shown by numerical computations that, for some matrices M,
the polynomial p(r) obtained in this way is not identically zero. For such M,
let Σ denote the set of real roots of p(r). For a given $r_0 \in \Sigma$ we can recover
the vector z by a process similar to the case N = 2.

Let V be a nonzero vector such that

$$A(r_0)\,V = 0 \qquad (30)$$

Using the 18 entries of V, form the 6 × 3 matrix Q:

$$Q = \begin{bmatrix} v_1 & v_2 & v_3 \\ v_4 & v_5 & v_6 \\ v_7 & v_8 & v_9 \\ v_{10} & v_{11} & v_{12} \\ v_{13} & v_{14} & v_{15} \\ v_{16} & v_{17} & v_{18} \end{bmatrix} \qquad (31)$$

If the value $r_0$ corresponds to a solution of B, the matrix Q is rank-one.
If this condition is satisfied, perform the dyadic decomposition:

$$Q = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \end{bmatrix} \begin{bmatrix} \bar{z}_1 & \bar{z}_2 & \bar{z}_3 \end{bmatrix} \qquad (32)$$

The solution vector $(z_1, z_2, z_3)$ can be obtained by taking the conjugate of
the right dyadic component of Q, or by computing the dyadic factors of the
matrix P given by:

$$P = \begin{bmatrix} y_1 & y_2 & y_3 \\ y_2 & y_4 & y_5 \\ y_3 & y_5 & y_6 \end{bmatrix} \qquad (33)$$

The matrix P should be rank-one, symmetric, if $r_0$ corresponds to a solution
of B. In that case the decomposition

$$P = \begin{bmatrix} z_1 \\ z_2 \\ z_3 \end{bmatrix} \begin{bmatrix} z_1 & z_2 & z_3 \end{bmatrix} \qquad (34)$$

should yield a vector z that is proportional to the conjugate of the right
dyadic component of Q.


3.3  The Case N = 4

The procedure for N = 4 is similar to that required for N = 2 and N = 3.
This time, the set B consists of bihomogeneous polynomials of bidegree
(3,3). One subset of equations is obtained by multiplying the four equations of
bidegree (1,1) by the 100 basis monomials in $P_4(2,2)$, yielding 400 equations
in B. The second subset is obtained by multiplying the 36 4 × 4 minors of J
of bidegree (2,2) by the 16 basis monomials of $P_4(1,1)$, yielding another 576
equations of the same type.

The space $P_4(3,3)$ has dimension 400, so the matrix A(r) has size 976 ×
400. By numerical computation we have verified that there are 4 × 4 matrices
M for which the matrix A(r) has rank 400 for generic values of r. There is
a polynomial p(r) whose roots form the set Σ of r-values for which the rank
of A(r) is less than 400. At the moment, we do not know the degree of this
polynomial, though we suspect (for reasons that will be presented later) the
degree is 272.
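
The bookkeeping for the three cases can be reproduced with a short count. The sketch below is illustrative only and rests on one stated assumption about the layout of J: the N left-hand columns have entries linear in $\bar{z}$ and the N right-hand columns have entries linear in $z$, so a size-N minor using j left-hand columns has bidegree (N - j, j). The function names are hypothetical.

```python
from math import comb

def minors_by_bidegree(N):
    """Number of size-N minors of the N x 2N matrix J, keyed by bidegree (i, j)."""
    # Choose j of the N left columns and N - j of the N right columns.
    return {(N - j, j): comb(N, j) * comb(N, N - j) for j in range(N + 1)}

def dim_bihomogeneous(N, i, j):
    """Number of basis monomials of bidegree (i, j) in N variables."""
    return comb(N + i - 1, i) * comb(N + j - 1, j)

def system_size_n4():
    """Rows and columns of A(r) for N = 4 with target bidegree (3, 3)."""
    rows = (4 * dim_bihomogeneous(4, 2, 2)
            + minors_by_bidegree(4)[(2, 2)] * dim_bihomogeneous(4, 1, 1))
    cols = dim_bihomogeneous(4, 3, 3)
    return rows, cols
```

For N = 2 the count gives one minor of bidegree (2,0), four of bidegree (1,1) and one of bidegree (0,2); for N = 3 it gives nine minors of bidegree (2,1); for N = 4 it gives the 976 × 400 system described above.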

The value of the z-vector can be computed by methods similar to those
described in the cases N = 2 and N = 3. A numerical example is presented
in Section 8.

3.4  The Case N > 4

We have not attempted to compute any examples for the case N = 5 or
larger. We suspect these cases can be handled by a similar computational
procedure, but we have no proof.

The degree of the polynomial p(r) in the case N = 5 is believed to be
2150. A formula for the (suspected) degree of p(r) for N ≥ 4 is presented at
the end of the next section on hyperdeterminants.

4  Hyperdeterminants

The r-polynomial has a close tie with the theory of hyperdeterminants.
Hyperdeterminants were invented by Cayley [10]; a modern treatment is
presented in [11]. In this report we will discuss only those facts about
hyperdeterminants needed to help understand the r-polynomial. The reader is
referred to [11] for proofs of the results stated here.

First, some basic definitions. For an integer N > 0, suppose we are given a
collection of numbers indexed by integers i, j and k each running from 1 to
N. Let $W = [W_{ijk}]$ denote such a collection. We call W a three-dimensional
matrix of size N. The numbers that constitute W can be pictured in a
three-dimensional array - a generalization of the planar array of a standard
(two-dimensional) matrix. Three-dimensional matrices of size N often arise
when a three-index tensor is written relative to a basis of the underlying
N-dimensional vector space.

We will be using basic facts about discriminant polynomials. We consider
a general form f over $\mathbb{C}^N$ and its zero-set, the hypersurface Z(f) of the
projective space $\mathbb{P}^{N-1}$. If f is a form of degree d then the gradient of f, ∇f,
is a vector of forms homogeneous of degree d − 1. For a generic form f there
are no points z ∈ Z(f) such that ∇f(z) = 0. In that case, the hypersurface
Z(f) is a smooth manifold of dimension N − 2 in $\mathbb{P}^{N-1}$.

For some forms f the hypersurface Z(f) is not smooth (i.e. Z(f) is
singular). There is an irreducible polynomial in the coefficients of f, called
the discriminant of f, which vanishes if and only if there is a point in Z(f)
where the gradient of f vanishes.

The general theory of discriminants has been well studied; much is known
about them. It is a classic result that the discriminant of a homogeneous
form f of degree d over $\mathbb{C}^N$ is a polynomial of degree $N(d-1)^{N-1}$ in the
$\binom{N+d-1}{N-1}$ coefficients of f (see [12], p. 99).
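
As a quick worked check of this degree formula, the sketch below evaluates it for two familiar cases. The function name is illustrative. A binary quadratic $at_1^2 + bt_1t_2 + ct_2^2$ has discriminant $b^2 - 4ac$, of degree 2 in its three coefficients, and a ternary cubic has a discriminant of degree 12 in its ten coefficients.

```python
from math import comb

def discriminant_degree(N, d):
    """Degree of the discriminant of a degree-d form in N variables."""
    return N * (d - 1) ** (N - 1)

def coefficient_count(N, d):
    """Number of coefficients of a general degree-d form in N variables."""
    return comb(N + d - 1, N - 1)

# (N, d) = (2, 2): binary quadratic; (3, 3): ternary cubic.
degrees = {(N, d): discriminant_degree(N, d) for (N, d) in [(2, 2), (3, 3)]}
```

These two degrees, 2 and 12, are the same as the sizes of the 2 × 2 and 12 × 12 coefficient matrices that appear in the low-dimensional computations of Section 6.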

We are now prepared to discuss hyperdeterminants of three-dimensional
matrices. Though we do not present an explicit construction, the
hyperdeterminant of W is a polynomial function q(W) of the values $\{W_{ijk}\}$. The
significance of the hyperdeterminant polynomial is as follows:

Lemma 1  There is a polynomial q(W) in the variables $\{W_{ijk}\}$, called the
hyperdeterminant polynomial, which vanishes if and only if there are three
nonzero N-vectors x, y, z such that:

$$\forall k \quad \sum_{i,j=1}^{N} W_{ijk}\, x_i y_j = 0 \qquad (35)$$

$$\forall j \quad \sum_{i,k=1}^{N} W_{ijk}\, x_i z_k = 0 \qquad (36)$$

$$\forall i \quad \sum_{j,k=1}^{N} W_{ijk}\, y_j z_k = 0 \qquad (37)$$

For more details concerning this lemma and the other results stated in
this section, the reader is referred to [11].

If the entries of W are polynomial functions of a variable r, the
hyperdeterminant of W is a polynomial in r.

The motivation for the nomenclature "hyperdeterminant" can be
understood by a comparison with the ordinary determinant function for
(two-dimensional) matrices. Given an N × N matrix A there is an associated
bilinear form a:

$$a(x,y) = \sum_{i,j=1}^{N} A_{ij}\, x_i y_j \qquad (38)$$

Suppose there are nonzero vectors x, y such that a(x,y) = 0 and

$$\partial a(x,y)/\partial x = 0 \qquad \partial a(x,y)/\partial y = 0 \qquad (39)$$

or, equivalently:

$$\forall j \quad \sum_{i=1}^{N} A_{ij}\, x_i = 0 \qquad \forall i \quad \sum_{j=1}^{N} A_{ij}\, y_j = 0 \qquad (40)$$

Either of the two equations in 40 implies that the determinant of the matrix
A must vanish. Conversely, given that the determinant of A is zero, we can
find a pair of nonzero vectors x, y such that equation 40 is satisfied.

By the argument just given, we see that the determinant of the matrix A is
exactly the discriminant of the associated bilinear form a. The analogy with
hyperdeterminants can now be made. Associated with the three-dimensional
matrix W of size N is a trilinear form ω(x, y, z):

$$\omega(x,y,z) = \sum_{i,j,k=1}^{N} W_{ijk}\, x_i y_j z_k \qquad (41)$$

The discriminant of the form ω is the hyperdeterminant of W.

Having vaguely defined the hyperdeterminant of a three-dimensional
matrix, we now show how the basic equations of Section 2 lead to a
hyperdeterminantal condition.

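Recall the basic equations:

$$\forall k \quad z^* H_k(r) z = 0 \qquad (42)$$

and the rank condition, that $J(r,z,\bar{z})$ defined by:

$$J(r,z,\bar{z}) = \begin{bmatrix} z^* H_1(r) & z^T \overline{H_1(r)} \\ \vdots & \vdots \\ z^* H_N(r) & z^T \overline{H_N(r)} \end{bmatrix} \qquad (43)$$

should be rank less than N.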

From the rank condition we can derive the following lemma.

Lemma 2  Let $(r,z,\bar{z})$, $r \in \mathbb{R}$, $z \neq 0$, be a point where the matrix $J(r,z,\bar{z})$
is rank < N. Then there is a nonzero, real vector $t \in \mathbb{R}^N$ such that

$$\sum_{k=1}^{N} t_k H_k(r) z = 0 \qquad \sum_{k=1}^{N} t_k z^* H_k(r) = 0 \qquad (44)$$

Proof of Lemma 2: Because J is rank deficient, there exists a nonzero
$t \in \mathbb{C}^N$ such that

$$\sum_{k=1}^{N} t_k z^* H_k(r) = 0 \qquad \sum_{k=1}^{N} t_k z^T \overline{H_k(r)} = 0 \qquad (45)$$

Equivalently, taking the conjugate-transpose of the first equation and the
transpose of the second (and using the fact that $H_k(r)$ is hermitian):

$$\sum_{k=1}^{N} \bar{t}_k H_k(r) z = 0 \qquad \sum_{k=1}^{N} t_k H_k(r) z = 0 \qquad (46)$$

Considering these two equations together, we see that the vector t can
be replaced equally well by both its real part and its imaginary part in all
the equations above. Because t is nonzero, one of these two real vectors is
nonzero. We conclude that for some nonzero real vector t both equations 44
are satisfied. □

Define the three-dimensional matrix W by:

$$W_{ijk} = \left(H_k(r)\right)_{ij} \qquad (47)$$

Now replace $\bar{z}$ by an independent variable y. The original N equations then
have the form:

$$\forall k \quad \sum_{i,j=1}^{N} W_{ijk}\, y_i z_j = 0 \qquad (48)$$

With this same substitution, the two equations in the previous lemma
become:

$$\forall i \quad \sum_{j,k=1}^{N} W_{ijk}\, z_j t_k = 0 \qquad \forall j \quad \sum_{i,k=1}^{N} W_{ijk}\, y_i t_k = 0 \qquad (49)$$

We see that the hyperdeterminant of the three-dimensional matrix W
must vanish. The hyperdeterminant of W is a polynomial q(r), one of whose
roots is μ(M). We conclude that the r-polynomial p(r) divides the
hyperdeterminant q(r).


Remark 1  In our argument to show that the hyperdeterminant of W
vanishes, no use was made of the fact that the vector t may be chosen real.
The reality condition on t is significant, however, when we consider whether
the hyperdeterminant q(r) is the same as the r-polynomial p(r). Consider a
real value $r_0$ at which the hyperdeterminant of W vanishes, and let $S_{r_0}$ be
the associated set of (y,z,t) satisfying the system of equations in Lemma [].
Then $r_0$ is a root of p(r) if and only if some point in $S_{r_0}$ satisfies the reality
conditions:

$$y \neq 0,\ \bar{y} = z, \qquad t \neq 0,\ t \in \mathbb{R}^N \qquad (50)$$

In general, we do not know whether the degree of q(r) will equal the degree
of the r-polynomial p(r).


We are now in a position to take advantage of theoretical results
concerning hyperdeterminants. One such result is:

Lemma 3  Let q(W) denote the hyperdeterminant of the three-dimensional
matrix W of size N. The degree of q(W) is:

$$\deg(q) = \sum_{0 \le j \le (N-1)/2} \frac{(j+N)!\; 2^{\,N-1-2j}}{(j!)^3\,(N-1-2j)!} \qquad (51)$$

This formula is Corollary 2.9 on page 456 of [11]. For N = 2, 3, 4 and 5 the
values are 4, 36, 272 and 2150 respectively.
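
The quoted values can be reproduced directly. The sketch below evaluates the sum $\sum_{0 \le j \le (N-1)/2} (j+N)!\,2^{N-1-2j} / \big((j!)^3 (N-1-2j)!\big)$; this closed form is a reading of the cited corollary that is assumed here, and it is checked only against the four values stated in the text. The function name is illustrative.

```python
from math import factorial

def hyperdeterminant_degree(N):
    """Degree of the hyperdeterminant of an N x N x N array (assumed closed form)."""
    total = 0
    for j in range((N - 1) // 2 + 1):
        numerator = factorial(j + N) * 2 ** (N - 1 - 2 * j)
        denominator = factorial(j) ** 3 * factorial(N - 1 - 2 * j)
        total += numerator // denominator      # each term is an integer
    return total

degrees_2_to_5 = [hyperdeterminant_degree(N) for N in range(2, 6)]
```

The rapid growth of these degrees is one reason the direct hyperdeterminant route is hard to use computationally.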

It remains an interesting open question whether the r-polynomial is
equivalent to the hyperdeterminant polynomial for all N.

5  Families of Hypersurfaces

The  hyperdeterminant  approach  discussed  in  the  previous  section  has  certain 
theoretical  advantages,  but  so  far  we  have  not  been  able  to  make  use  of  it 
in  an  efficient  computational  method.  An  approach  that  lends  itself  more 
readily  to  computations  is  the  subject  of  this  section.  The  method  described 
below  was  first  applied  in  [2]. 

We adopt the notation used in the statement of Lemma 2 in Section 4.
Suppose r is a real variable, z is a complex vector variable and t is a real
vector variable. Define H(t,r) to be the N × N matrix of forms:

$$H(t,r) = \sum_{k=1}^{N} t_k H_k(r) \qquad (52)$$

An immediate corollary of Lemma 4.2 can be stated:

Corollary 1  Let $f_r(t)$ denote the polynomial function:

$$f_r(t) = \mathrm{Det}(H(t,r)) \qquad (53)$$

Consider r to be fixed, so that $f_r$ is a form of degree N in the vector t. If
there exists a nonzero vector z such that $(r,z,\bar{z})$ satisfies the conditions of
Lemma 2, then the discriminant of $f_r$ vanishes. That is, there is a nonzero
vector $t_0$ such that:

$$\forall i \quad \frac{\partial f_r}{\partial t_i}(t_0) = 0 \qquad (54)$$

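Proof of Corollary: Fix z and r as above, and select an invertible N × N
matrix U in which z is the first column. Consider the matrix of forms L(t)
(we suppress the dependence on r, which is fixed):

$$L(t) = U^* H(t,r)\, U \qquad (55)$$

and note:

$$\mathrm{Det}(L(t)) = |\mathrm{Det}(U)|^2\, \mathrm{Det}(H(t,r)) = |\mathrm{Det}(U)|^2 f_r(t) \qquad (56)$$

Observe that L(t) is of the form:

$$L(t) = \begin{pmatrix} 0 & a^*(t) \\ a(t) & A(t) \end{pmatrix} \qquad (57)$$

where a and a* are linear forms with values in $\mathbb{C}^{N-1}$ such that $a(t_0) = 0$
and $a^*(t_0) = 0$; here $t_0$ is the nonzero real vector supplied by Lemma 2, and
the upper-left entry $z^* H(t,r) z$ vanishes identically because z satisfies
equations 15. An elementary computation reveals, for all t:

$$\mathrm{Det}(L(t)) = -\,a^*(t)\, \mathrm{adj}(A(t))\, a(t) \qquad (58)$$

where adj(A(t)) is the matrix of cofactors (adjoint matrix) of A(t). Examining
the first-order behavior of this last expression at $t_0$ we find that

$$\forall i \quad \left.\frac{\partial\, \mathrm{Det}(L(t))}{\partial t_i}\right|_{t=t_0} = 0 \qquad (59)$$

It follows that the discriminant of $f_r$ vanishes. □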
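
The determinant identity used in this proof can be checked numerically. The sketch below is an illustration under stated assumptions: it assumes NumPy, takes the bordered matrix to have a zero upper-left entry, and uses the adjugate convention $\mathrm{adj}(A) = \det(A)\,A^{-1}$ for invertible A, under which the identity reads $\det L = -a^*\,\mathrm{adj}(A)\,a$. The function name and the random test data are hypothetical. Because the right-hand side is quadratic in the border a, it vanishes to second order wherever a vanishes, which is what forces the gradient to be zero.

```python
import numpy as np

def bordered_det_pair(a, A):
    """Return (det(L), -a^* adj(A) a) for L = [[0, a^*], [a, A]]."""
    a = np.asarray(a, dtype=complex).reshape(-1, 1)
    A = np.asarray(A, dtype=complex)
    L = np.block([[np.zeros((1, 1)), a.conj().T], [a, A]])
    adj_A = np.linalg.det(A) * np.linalg.inv(A)   # adjugate, valid when A is invertible
    return np.linalg.det(L), -(a.conj().T @ adj_A @ a)[0, 0]

# Random hermitian block A and random border a (seeded for repeatability).
rng = np.random.default_rng(0)
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A_test = B + B.conj().T
a_test = rng.standard_normal(3) + 1j * rng.standard_normal(3)
lhs, rhs = bordered_det_pair(a_test, A_test)
```

Scaling the border by a small factor ε scales both sides by ε², which is the second-order vanishing used in the last step of the proof.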

This corollary provides an advantage from a computational viewpoint.
The advantage is that the discriminant condition on the form $f_r$ leads to an
elimination of the N variables $t_i$ from a system of N homogeneous equations.
The goal of the elimination is a polynomial $p_d(r)$ whose roots we compute
numerically. By comparison, the basic equations derived in Section 2 require
elimination of 2N variables, while the hyperdeterminant approach of
Section 4 requires elimination of 3N variables. In general, the computational
difficulty in performing an elimination increases rapidly with the number of
variables to be eliminated, so the advantage of computing $p_d(r)$ instead of
the other eliminants is clear. The disadvantage is that this approach requires
modification in the case where N > 3.

This corollary provides a nice geometric view of the problem. Thinking of
$f_r$ as a family of forms, parametrized by the variable r, there is the associated
1-parameter family of algebraic zero sets $V_r$ of the projective space $\mathbb{P}^{N-1}$.
The nonvanishing of the discriminant $p_d(r)$ is associated with the geometric
condition that $V_r$ is a smooth submanifold (nonsingular variety). As the
parameter r varies, we can imagine the family $V_r$ deforming continuously
through smooth varieties for an open set of values of r, with occasional
singular sets occurring at the roots of $p_d(r)$. This simple geometric picture
is not exactly correct, as we shall see, but the concept can be made precise.

Because the converse of the corollary is not true, the polynomial $p_d(r)$
is not the same as the r-polynomial p(r). All we can conclude is that the
polynomial we really care about, p(r), is a factor of the more easily computed
polynomial $p_d(r)$.

This corollary is applied in Section 6 to the two tractable cases N = 2 and
N = 3. The difficulty in the case N > 3 is discussed along with some related
general theory in Section 7.

6  Families of Hypersurfaces - Low Dimensional Computations

In this section we apply the family-of-hypersurfaces approach to look at the
cases N = 2 and N = 3. At the end we explain a complication that arises
in applying this technique to the cases N > 3. As in the previous section, r
is a real variable and t is a real vector variable. Given the N × N matrix of
forms:

$$H(t,r) = \sum_{k=1}^{N} t_k H_k(r) \qquad (60)$$

we are interested in the polynomial $f_r(t)$ defined by

$$f_r(t) = \mathrm{Det}(H(t,r)) \qquad (61)$$

and its gradient with respect to t.


6.1  The Case N = 2

In the case N = 2 the polynomial $f_r(t)$ has the form:

$$f_r(t) = c_{20}(r)\,t_1^2 + c_{11}(r)\,t_1 t_2 + c_{02}(r)\,t_2^2 \qquad (62)$$

where each of the functions $c_{ij}(r)$ is quadratic in $r^2$.

For each value of r the zero-set of $f_r$ defines a pair of points in the
projective line $\mathbb{P}^1$. The vanishing of the discriminant is equivalent to the
geometric condition that the two points are coincident.

The two algebraic equations are:

$$0 = \frac{\partial f_r}{\partial t_1} = 2c_{20}(r)\,t_1 + c_{11}(r)\,t_2 \qquad (63)$$

$$0 = \frac{\partial f_r}{\partial t_2} = c_{11}(r)\,t_1 + 2c_{02}(r)\,t_2 \qquad (64)$$

This pair of equations can have a nontrivial solution only if the 2 × 2 coefficient
matrix C(r) defined by:

$$C(r) = \begin{pmatrix} 2c_{20}(r) & c_{11}(r) \\ c_{11}(r) & 2c_{02}(r) \end{pmatrix} \qquad (65)$$

is singular. The discriminant polynomial $p_d(r)$ is:

$$p_d(r) = \mathrm{Det}(C(r)) = 4c_{20}(r)\,c_{02}(r) - c_{11}(r)^2 \qquad (66)$$

which is quartic (degree 4) in $r^2$. There are at most four nonnegative real
roots $r_j$ of $p_d$, one of which is the value μ(M). It can be shown that, in this
case, μ(M) is the largest real root of $p_d$ (see Appendix A).
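
The N = 2 computation is small enough to carry out in a few lines. The following is a minimal sketch, assuming NumPy, the form $H_k(r) = M_k^* M_k - r^2 e_k^* e_k$ and the substitution $s = r^2$; the function names and the example matrix are hypothetical. It builds $c_{20}$, $c_{11}$, $c_{02}$ as polynomials in s, forms $p_d = 4c_{20}c_{02} - c_{11}^2$ and returns its nonnegative real roots. The example matrix is entrywise nonnegative, so multiplying its rows by unit-modulus phases cannot raise the spectral radius, and μ(M) equals the ordinary spectral radius; this gives an independent value to compare with the largest root.

```python
import numpy as np
from numpy.polynomial import Polynomial as Poly

def form_entries(M, k):
    """Entries of H_k as polynomials in s = r**2, with H_k = M_k^* M_k - s e_k^* e_k."""
    M = np.asarray(M, dtype=complex)
    row = M[k, :]
    outer = np.outer(row.conj(), row)                 # M_k^* M_k
    H = [[Poly([outer[i, j]]) for j in range(2)] for i in range(2)]
    H[k][k] = Poly([outer[k, k], -1.0])               # subtract s on the k-th diagonal entry
    return H

def det2(H):
    """Determinant of a 2 x 2 matrix of polynomials."""
    return H[0][0] * H[1][1] - H[0][1] * H[1][0]

def discriminant_roots(M):
    """Nonnegative real roots r of p_d for a 2 x 2 matrix M, in increasing order."""
    H1, H2 = form_entries(M, 0), form_entries(M, 1)
    H_sum = [[H1[i][j] + H2[i][j] for j in range(2)] for i in range(2)]
    c20, c02 = det2(H1), det2(H2)                     # coefficients of t1^2 and t2^2
    c11 = det2(H_sum) - c20 - c02                     # coefficient of t1*t2
    p_d = 4 * c20 * c02 - c11 * c11                   # discriminant as a polynomial in s
    s_roots = Poly(p_d.coef.real).roots()
    real_s = [s.real for s in s_roots if abs(s.imag) < 1e-8 and s.real >= 0]
    return sorted(np.sqrt(real_s))

M_example = np.array([[1.0, 2.0], [3.0, 4.0]])        # hypothetical nonnegative example
roots_example = discriminant_roots(M_example)
rho_example = max(abs(np.linalg.eigvals(M_example)))
```

For this matrix $p_d$ factors as $-(s^2 - 29s + 4)(s^2 - 5s + 4)$, and the largest root in r coincides with the spectral radius $(5 + \sqrt{33})/2$.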

6.2  The  Case  iV  =  3 

In  the  case  N  =  S  the  polynomial  /,(<)  has  the  form: 

fr{t)  =  C21o(r)<it2  +  C20l(»“)tit3  +  Cl2o(r)tlt2  +  Clll(r)<i<2<3  + 

cio2(r)iit|  +  co2i(r)t^t3  +  coi2(r)Ml  (67) 

where  each  of  the  functions  Cijk{r)  is  cubic  in  There  are  no  terms  of  the 
form  because  each  matrix  coefficient  of  tj  in  the  form  has  rank  at 

most  2  (as  can  be  verified  by  examining  the  construction  of  H{t^r)). 

For  each  value  of  r  the  zero-set  of  fr  defines  a  cubic  curve  in  the  projective 
plane  P*.  The  vanishing  of  the  discriminant  is  equivalent  to  the  geometric 
condition  that  the  curve  is  singular. 

The  three  algebraic  equations  are: 

0  =  ~  2c2io(r)tit2+2C20l(r)<if3-fCi2o(j’)t2+^lll(^)^2t3+Cl02(r)t3  (68) 

0  =  —  C2io(r)ti-|-2ci2o(r)ti<2+ciii(r)tit3-|-2co2i(r)t2t3-f coi2(r)t3  (69) 

®  ~  C20l(r)<i-l-Ciii(r)tit2+2Ci02(r)<lt3-l-C02l(r)t2+2C0l2(j')t2t3  (70) 

013 

Each of the three equations above can be used to derive four linear equations
in the set W of monomials:

$$W = \{\,t_1^3 t_2,\; t_1^3 t_3,\; t_1^2 t_2^2,\; t_1^2 t_2 t_3,\; t_1^2 t_3^2,\; t_1 t_2^3,\; t_1 t_2^2 t_3,\; t_1 t_2 t_3^2,\; t_1 t_3^3,\; t_2^3 t_3,\; t_2^2 t_3^2,\; t_2 t_3^3\,\} \tag{71}$$

For example, equation 68 can be multiplied by each of the four monomials
$\{t_1^2,\; t_1 t_2,\; t_1 t_3,\; t_2 t_3\}$ with the result linear in the monomials of W. By this
dialytic technique we obtain a set of twelve linear equations in the twelve
monomials of W.
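
The construction of the twelve equations can be sketched numerically. The following is a minimal sketch, assuming the coefficients $c_{ijk}$ have already been evaluated at a fixed r and are supplied as a dictionary keyed by exponent triples. The function names and the ordering of W are illustrative. The multipliers $\{t_i^2, t_1 t_2, t_1 t_3, t_2 t_3\}$ used for the i-th equation are stated in the text only for equation 68; for the other two equations they are an assumption made by symmetry.

```python
import numpy as np

# Exponent triples (a, b, c) stand for the monomial t1^a * t2^b * t3^c.
W = [(3, 1, 0), (3, 0, 1), (2, 2, 0), (2, 1, 1), (2, 0, 2), (1, 3, 0),
     (1, 2, 1), (1, 1, 2), (1, 0, 3), (0, 3, 1), (0, 2, 2), (0, 1, 3)]

def partial(coeffs, i):
    """Differentiate a polynomial {exponents: coefficient} in variable i (0-based)."""
    out = {}
    for exps, c in coeffs.items():
        if exps[i] > 0:
            e = list(exps)
            e[i] -= 1
            out[tuple(e)] = out.get(tuple(e), 0.0) + c * exps[i]
    return out

def dialytic_matrix(coeffs):
    """12 x 12 coefficient matrix for a cubic form that has no t_j^3 terms."""
    rows = []
    for i in range(3):
        square = tuple(2 if j == i else 0 for j in range(3))   # t_i^2
        grad_i = partial(coeffs, i)
        for mult in [square, (1, 1, 0), (1, 0, 1), (0, 1, 1)]:
            row = np.zeros(len(W))
            for exps, c in grad_i.items():
                prod = tuple(a + b for a, b in zip(exps, mult))
                row[W.index(prod)] += c    # each product is a monomial of W
            rows.append(row)
    return np.array(rows)

def monomials(t):
    """Vector of the monomials of W evaluated at the point t."""
    return np.array([t[0]**a * t[1]**b * t[2]**c for a, b, c in W])

# Illustrative singular cubic (t1 - t2)^2 * t3; its gradient vanishes at (1, 1, 1).
singular = {(2, 0, 1): 1.0, (1, 1, 1): -2.0, (0, 2, 1): 1.0}
C_singular = dialytic_matrix(singular)
residual = C_singular @ monomials((1.0, 1.0, 1.0))
```

For this cubic the vector of monomials at (1, 1, 1) lies in the kernel of the matrix, so `residual` is zero and the matrix is singular, which is the situation the discriminant is meant to detect.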

These twelve equations can have a nontrivial solution only if the 12 x 12
coefficient matrix C(r) of this linear system is singular. The discriminant
polynomial $p_d(r)$ is:

$$p_d(r) = \mathrm{Det}(C(r)) \tag{72}$$

which is degree 36 in $r^2$. There are at most 36 nonnegative real roots $r_j$ of
$p_d$, one of which is the value $\mu(M)$.

Unfortunately, in this case the rank deficiency of C(r) at $r = r_0$ is necessary
but not sufficient for $r_0$ to be a root of the $\mu$-polynomial. There is
a Zariski-closed set of 3 x 3 matrices M for which the matrix C(r) is rank-deficient
for all r. It is true, however, that a converse can be derived for
3 x 3 matrices M in the open set for which C(r) is full-rank for at least one
value of r. This issue and the complexity of the general situation are discussed
further in the next section.

6.3  The Cases N > 3

In the case N = 4 the polynomial $f_r(t)$ has the form:

$$f_r(t) = c_{2200}(r)\,t_1^2 t_2^2 + c_{2110}(r)\,t_1^2 t_2 t_3 + \cdots + c_{0022}(r)\,t_3^2 t_4^2 \tag{73}$$

where each of the functions $c_{ijkl}(r)$ is quartic in $r^2$. As in the case N = 3
there are no monomials including terms of the form $t_j^m$ for m > 2 because
each matrix coefficient of $t_j$ in the form $H(t,r)$ has rank at most 2.

For each value of r the zero-set of $f_r$ defines a quartic surface in the
projective space $P^3$. The vanishing of the discriminant is equivalent to the
geometric condition that the surface is singular.

If we try to proceed as in the case N = 3, we eventually find that the
dialytic method applied to the discriminant of $f_r$ does not work. The problem
is that, for all r, the zero set of $f_r$ is a singular hypersurface in $P^3$.
The coordinate vertices (1,0,0,0), (0,1,0,0), (0,0,1,0) and (0,0,0,1) are always
singular points.

The coordinate vertices are singular points for the $Z(f_r)$-hypersurfaces
for all N > 3. For N = 4 they are double points, for N = 5 they are triple
points, etc.

There  are  techniques  for  dealing  with  this  situation;  we  shall  discuss  some 
of  them  in  the  next  section. 




7  Families of Hypersurfaces - General Results

In this section we examine more closely the family-of-hypersurfaces approach
to general cases. As in the previous section, r is a real variable and t is a
real vector variable. Given the N x N matrix of forms:

$$H(t,r) = \sum_{k=1}^{N} t_k\, H_k(r) \tag{74}$$

we are interested in the polynomial $f_r(t)$ defined by

$$f_r(t) = \mathrm{Det}(H(t,r)) \tag{75}$$

and its gradient with respect to t.


7.1  The Case N = 4

As mentioned in the previous section, in the case N = 4 the polynomial $f_r(t)$
has the form:

$$f_r(t) = c_{2200}(r)\,t_1^2 t_2^2 + c_{2110}(r)\,t_1^2 t_2 t_3 + \cdots + c_{0022}(r)\,t_3^2 t_4^2 \tag{76}$$

where each of the functions $c_{ijkl}(r)$ is quartic in $r^2$.

For each value of r the zero-set of $f_r$ defines a singular quartic surface
$Z(f_r)$ in the projective space $P^3$. Fix a value of r and consider the geometry
of this surface $V = Z(f_r)$. The values of r associated with the r-polynomial
are associated with singular points on V of a special type. In the following we
will determine more precise conditions that this singular point must satisfy
to be associated with the r-polynomial.

As we observed earlier, for all values of r the coordinate vertices (1,0,0,0),
(0,1,0,0), (0,0,1,0) and (0,0,0,1) are singular points of V, so the discriminant
of $f_r$ vanishes identically. Consequently, Corollary 1 of Section 5 provides no
information about the value of $\mu$. Corollary 1 can be strengthened by adding
a condition that eliminates these coordinate-vertex singular points and others
like them. In this subsection we show how the result can be strengthened in
the case N = 4.

First we consider some general facts about matrices. Denote by M(k, l)
the set of complex k x k matrices of rank no more than l. Inside of M(4,4)
lies the codimension-4 algebraic subset M(4,2) of those matrices of rank 2
or less. (In general, the codimension of the set M(k, l) inside of M(k, k) for
k > l is $(k - l)^2$.) Let us work in the projective space $P^{15}$ associated with
the nonzero 4 x 4 matrices.

Inside of $P^{15}$ the form H(t, r) provides an imbedded space Q isomorphic
to $P^3 \times P^1$. This space Q, being four-dimensional, should be expected to
intersect M(4,2) in a nonempty set of points. We observe:

Lemma 1  At each point (t, r) in $Q \cap M(4,2)$ the gradient of the function
$f_r(t)$ vanishes.

Proof of Lemma: Let $M_i(t,r)$ denote the i-th column of H(t,r). The
partial derivative of the determinant with respect to $t_j$ can be written:

$$\frac{\partial f_r}{\partial t_j} = \det\!\left(\frac{\partial M_1}{\partial t_j}\; M_2\; M_3\; M_4\right) + \cdots + \det\!\left(M_1\; M_2\; M_3\; \frac{\partial M_4}{\partial t_j}\right) \tag{77}$$

Each term in the sum on the right-hand side vanishes because any three
columns $M_i(t,r)$ are linearly dependent. □
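
Lemma 1 is easy to check numerically. The following is a minimal sketch, assuming randomly generated Hermitian coefficient matrices (they are illustrative, not the $H_k(r)$ of the text) arranged so that the combination at a chosen point `t0` has rank 2. The gradient of the determinant is estimated by central differences.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4

def random_hermitian(n):
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (a + a.conj().T) / 2

# A Hermitian target matrix of rank 2 = N - 2.
v = rng.normal(size=(N, 2)) + 1j * rng.normal(size=(N, 2))
target = v @ np.diag([1.0, -1.0]) @ v.conj().T

t0 = np.array([1.0, -0.5, 0.7, 2.0])
H = [random_hermitian(N) for _ in range(N - 1)]
# Choose the last coefficient so that sum_k t0[k] * H[k] equals the rank-2 target.
H.append((target - sum(t * h for t, h in zip(t0, H))) / t0[-1])

def f(t):
    """Determinant of the matrix of forms sum_k t[k] * H[k] (real for Hermitian H[k])."""
    return np.linalg.det(sum(tk * hk for tk, hk in zip(t, H))).real

def gradient(t, step=1e-5):
    """Central-difference estimate of the gradient of f at t."""
    g = np.zeros(len(t))
    for j in range(len(t)):
        e = np.zeros(len(t))
        e[j] = step
        g[j] = (f(t + e) - f(t - e)) / (2 * step)
    return g

grad_at_t0 = gradient(t0)
```

Up to finite-difference error, every component of `grad_at_t0` is zero, as the lemma predicts for a point where the rank drops by two.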

It is the presence of points (t,r) for which H(t,r) has excess rank deficiency
that complicates the converse of Corollary 1, Section 5. The next two
lemmas, true for general values of N, clarify the problem.

Lemma 2  Consider the case of general N. Suppose that for some real value
$r_0$ the gradient of $f_{r_0}$ vanishes at a real vector $t_0$ for which $H(t_0, r_0)$ has rank
N - 1. Then there is a nonzero complex vector z such that $(r_0, z, \bar z)$ satisfies
the conditions of Lemma 2 of Section 2; that is, $r_0$ is an r-value.

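Proof of Lemma: Let $(t_0, r_0)$ satisfy the hypotheses of the lemma, and
let z(t) be any column of the adjoint matrix of $H(t, r_0)$ such that $z(t_0)$ is
nonzero (there is such a column because of the rank N - 1 assumption).
Now consider the function g(t) defined by:

$$g(t) = z(t)^* H(t, r_0)\, z(t) \tag{78}$$

The polynomial g(t) has $f_{r_0}(t)$ as a factor. Because $f_{r_0}$ and its gradient both
vanish at $t_0$, we know that $g(t_0) = 0$ and $\partial g(t_0)/\partial t_i = 0$ for all i.
On the other hand,

$$\frac{\partial g(t_0)}{\partial t_i} = \left.\frac{\partial\, z(t)^* H(t, r_0)\, z(t)}{\partial t_i}\right|_{t = t_0} \tag{79}$$

and the right hand side is exactly $z^* H_i(r_0) z$ with $z = z(t_0)$, because
$H(t_0, r_0)\, z(t_0) = 0$ removes the terms that contain derivatives of z(t). □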

The situation where there is a real point $(t_0, r_0)$ at which the matrix
$H(t_0, r_0)$ loses rank by more than 1 is more complicated. To understand this
situation we need to introduce two additional concepts. We restrict attention
to the case where the rank of $H(t_0, r_0)$ is N - 2.

First, we have the N x N Hessian matrix $\mathrm{Hess}(f_r)$ defined by:

$$\mathrm{Hess}(f_r)_{ij} = \frac{\partial^2 f_r}{\partial t_i\, \partial t_j} \tag{80}$$

Because of the rank N - 2 assumption the matrix $\mathrm{Hess}(f_{r_0})(t_0)$ is nonzero.
Because $(t_0, r_0)$ is real, the Hessian matrix is Hermitian.

Second, for an N x N matrix H denote by $c_{N-2}(H)$ the sum of the
principal-minor determinants of size (N - 2) x (N - 2). Again, by the
assumptions made we know that $c_{N-2}(H(t_0, r_0))$ is a nonzero real number.


Lemma 3  Consider the case of general N. Suppose that for some real value
$r_0$ the gradient of $f_{r_0}$ vanishes at a real vector $t_0$ for which $H(t_0, r_0)$ has rank
N - 2. Then $r_0$ is an r-value if and only if the normalized Hessian:

$$\frac{\mathrm{Hess}(f_{r_0})(t_0)}{c_{N-2}(H(t_0, r_0))} \tag{81}$$

is negative-semidefinite, rank no more than 2.


Proof of Lemma: Let $(t_0, r_0)$ satisfy the hypotheses of the lemma. Then
we may choose an N x N complex matrix S of determinant 1 such that the matrix
of forms $\Phi(t)$ defined by:

$$\Phi(t) = S^* H(t, r_0)\, S = \begin{pmatrix} A(t) & B(t) \\ B^*(t) & D(t) \end{pmatrix} \tag{82}$$

satisfies the conditions $A(t_0) = 0$, $B(t_0) = 0 = B^*(t_0)$ where A(t) is 2 x 2,
B(t) is 2 x (N - 2) and $B^*(t)$ is the conjugate transpose of the form B(t).

Observe that $r_0$ is an r-value if and only if for some real $t_0$ as above the
(1,1) entry of A(t) is the zero form (the first column of the matrix S is the
vector z satisfying $z^* H_j(r_0) z = 0$).

It is a simple computation to verify that the normalized Hessian matrix
$\mathrm{Hess}_0$ is exactly the Hessian of the 2 x 2 determinant function Det(A(t)).
We will show that a constant S exists making $A_{11}(t)$ the zero form if and only
if the Hessian of Det(A(t)) is negative semidefinite, rank no greater than 2.

First, if $A_{11}(t) = 0$, observe that

$$\mathrm{Det}(A(t)) = -A_{12}(t)\,A_{21}(t) = -|A_{12}(t)|^2 \tag{83}$$

is a real, negative semidefinite quadratic form. It vanishes on the set of t such
that $A_{12}(t) = 0$. Because $A_{12}$ is a complex valued linear form, the dimension
of its kernel is at least N - 2; hence the rank of the Hessian form is no
greater than 2.

Conversely, suppose A(t) is a 2 x 2 Hermitian matrix of linear forms
defined on t in $R^N$, and that Det(A(t)) is negative semidefinite, rank no
more than 2. We want to show there is a 2 x 2 matrix L of determinant 1
such that

$$L^* A(t)\, L = \begin{pmatrix} 0 & * \\ * & * \end{pmatrix} \tag{84}$$

Without loss of generality we can write A(t) in the form:

$$A(t) = \begin{pmatrix} a(t) + d(t) & b(t) + \sqrt{-1}\,c(t) \\ b(t) - \sqrt{-1}\,c(t) & a(t) - d(t) \end{pmatrix} \tag{85}$$

where a(t), b(t), c(t), d(t) are real linear forms. With this notation,

$$\mathrm{Det}(A(t)) = a(t)^2 - b(t)^2 - c(t)^2 - d(t)^2 \tag{86}$$

So the determinant of A(t) is the square of the Minkowski pseudo-norm of
the vector of real linear forms (a, b, c, d). For any 2 x 2 complex matrix L of
determinant 1 the transformation $g_L$ defined by:

$$g_L(A(t)) = L^* A(t)\, L = \begin{pmatrix} a'(t) + d'(t) & b'(t) + \sqrt{-1}\,c'(t) \\ b'(t) - \sqrt{-1}\,c'(t) & a'(t) - d'(t) \end{pmatrix} \tag{87}$$

induces a mapping

$$G_L(a, b, c, d) = (a', b', c', d') \tag{88}$$

which preserves the Minkowski pseudo-norm. In fact, this induced mapping
is the covering map of the Lie group SL(2, C) over the identity component
of the Lorentz group.
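
The invariance of the pseudo-norm can be confirmed with a small numerical experiment. The following is a minimal sketch, assuming a randomly generated matrix L rescaled to determinant 1 and a random coordinate vector (a, b, c, d) standing for the values of the four forms at one point t. The helper names are illustrative.

```python
import numpy as np

def to_matrix(a, b, c, d):
    """Hermitian 2 x 2 matrix with coordinates (a, b, c, d), as A(t) is written above."""
    return np.array([[a + d, b + 1j * c], [b - 1j * c, a - d]])

def to_coords(A):
    """Recover (a, b, c, d) from a Hermitian 2 x 2 matrix."""
    a = (A[0, 0] + A[1, 1]).real / 2
    d = (A[0, 0] - A[1, 1]).real / 2
    return np.array([a, A[0, 1].real, A[0, 1].imag, d])

def minkowski(v):
    """Square of the Minkowski pseudo-norm, a^2 - b^2 - c^2 - d^2."""
    a, b, c, d = v
    return a**2 - b**2 - c**2 - d**2

rng = np.random.default_rng(1)
L = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
L = L / np.sqrt(np.linalg.det(L))        # rescale so that det(L) = 1

v = rng.normal(size=4)
v_new = to_coords(L.conj().T @ to_matrix(*v) @ L)
```

Here `minkowski(v_new)` agrees with `minkowski(v)` to rounding error, because both equal the determinant of the corresponding 2 x 2 matrix and the determinant of L is 1.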

The condition that the form Det(A(t)) has rank no more than 2 implies
that the space of forms (a, b, c, d) spans a space W of dimension no greater
than 2. Suppose the subspace is two dimensional. By the negative semidefinite
assumption we know that the Minkowski inner product is non-positive on
W. The converse is proved if we can find a Lorentz transformation that
maps W into the three-dimensional subspace a(t) - d(t) = 0. But such a
transformation is easy to find using the properties of the Lorentz group. □

8  A Computational Example


In this section we present a numerical example of the computation of $\mu$ for a
particular 4 x 4 matrix M. This matrix M is one member of a family discussed
in [13] - its significance is discussed in that reference. Among other things,
it is rank 2, with both nonzero singular values equal to 1.

To represent complex numbers we adopt the notation (x, y) to denote the
number $x + \sqrt{-1}\,y$. The matrix M is:


M =
  (0.0000,0.0000)   (0.2989,0.0000)     (0.2989,0.0000)    (0.3493,0.3493)
  (0.2989,0.0000)   (0.0000,0.0000)     (0.2113,0.2113)    (0.4278,0.2470)
  (0.0000,0.2989)   (0.2113,-0.2113)    (0.0000,0.0000)    (0.2470,0.4278)
  (0.3493,0.3493)   (-0.4278,-0.2470)   (-0.4278,0.2470)   (0.0000,0.0000)


According to the algorithm described at the end of Section 3 a computer
program was used to generate the 976 x 400 matrix A(r) at values along a
grid in the r-space. A plot of the two singular values $\sigma_1$ and $\sigma_{400}$ is shown
below.

[Figure: singular values $\sigma_1$ and $\sigma_{400}$ of A(r) over the grid in r]

Observe the dip in the value of the 400th singular value at the value $r_0 =
0.841$. The minimum singular vector of $A(r_0)$ was computed and the derived
matrices were found to be (approximately) rank 1. The corresponding vector
z at r = 0.841 is:

z =
  (0.12997134685973169010, 0.00000000000000000000)
  (0.51950242727553730404, 0.00000052358896364930)
  (0.51950242727553708200, -0.00000052358896368336)
  (-0.66583924869638788646, 0.00000000000000019122)

The deltas are:

$\delta$ =
  (-0.84071373809592109261, 0.84071373809592109261)
  (1.17970685258911189841, -0.14884516617420087692)
  (1.17970685258911145432, 0.14884516617420112672)
  (-0.84079248581452115108, 0.84079248581452159517)

The matrix $I - M\Delta$ is:

  (1.0000,0.0000)    (-0.3526,0.0445)   (-0.3526,-0.0445)   (0.5874,0.0000)
  (0.2513,-0.2513)   (1.0000,0.0000)    (-0.2178,-0.2807)   (0.5674,-0.1520)
  (0.2513,0.2513)    (-0.2178,0.2807)   (1.0000,0.0000)     (0.5674,0.1520)
  (0.5873,0.0000)    (0.5414,0.2277)    (0.5414,-0.2277)    (1.0000,0.0000)


To verify that $I - M\Delta$ is singular we compute its singular values:

  2.13191170117315254018
  1.13696819871065679664
  0.99498302424291718005
  0.00000000000000004737
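
The singularity check can be repeated from the printed numbers. The following is a minimal sketch, assuming each printed group of four entries of M forms one column of the matrix (an arrangement that reproduces the printed entries of $I - M\Delta$) and that $\Delta$ is the diagonal matrix of the deltas. Because M is printed to only four digits, the smallest singular value recomputed this way is small but nowhere near the 1e-17 level reported for the full-precision data.

```python
import numpy as np

# Entries of M as printed (rounded to four digits); each printed group is a column.
M = np.array([
    [0, 0.2989, 0.2989, 0.3493 + 0.3493j],
    [0.2989, 0, 0.2113 + 0.2113j, 0.4278 + 0.2470j],
    [0.2989j, 0.2113 - 0.2113j, 0, 0.2470 + 0.4278j],
    [0.3493 + 0.3493j, -0.4278 - 0.2470j, -0.4278 + 0.2470j, 0],
])

# The deltas as printed, read as (real part, imaginary part).
delta = np.array([
    -0.84071373809592109261 + 0.84071373809592109261j,
     1.17970685258911189841 - 0.14884516617420087692j,
     1.17970685258911145432 + 0.14884516617420112672j,
    -0.84079248581452115108 + 0.84079248581452159517j,
])

residual_matrix = np.eye(4) - M @ np.diag(delta)
singular_values = np.linalg.svd(residual_matrix, compute_uv=False)

# Each delta has modulus close to 1 / 0.841, i.e. the perturbation has size 1 / r0.
inverse_moduli = 1.0 / np.abs(delta)
```

The last entry of `singular_values` is the quantity to inspect, and `inverse_moduli` shows that every delta has modulus close to 1/0.841.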


Examination of the plots for values of r > 0.841 showed that A(r) is
nonsingular for larger values (we needed only to check up to r = 1, because
that is the norm of M). We conclude that $\mu(M) = 0.841$.

The example just presented illustrates an important point about the
computation of r-values. The reader might recall the goal stated in Section 1, to
derive a polynomial whose largest real root is the value $\mu(M)$. Now that we
have come to an example, no polynomial was produced.

The point is that the r-polynomial is associated with some special matrix
problems which allow special methods of computation.

Though we have not examined the numerical properties of the solution
algorithms for these special types of problem, we have had good experience
with them in practice. Note how close to zero the smallest singular value of
$I - M\Delta$ is in the example. More is said about this issue in Section 10.1.

9  Abstract Interpretation


Using a geometric construction, the problem of determining the degree
of the r-polynomial for general N can be formulated in terms of intersection
theory [14]. In this section we develop that formulation.

We begin by modifying the basic equations

$$z^* H_k(r)\, z = 0, \qquad k = 1, \ldots, N \tag{89}$$

The first step is to replace $\bar z$ by an independent variable w, to give:

$$w^T H_k(r)\, z = 0, \qquad k = 1, \ldots, N \tag{90}$$

The second step is to introduce homogeneous coordinates $(r_1, r_2)$ in place
of the single affine coordinate r:

$$H_k(r) = r_1\, M_k^* M_k - r_2\, e_k^* e_k \tag{91}$$

These two changes of variable make the equations 90 tri-linear in the
three vectors z, w and r. We now interpret these equations on the product
X of three projective spaces:

$$X = P^{N-1} \times P^{N-1} \times P^1 \tag{92}$$

where the last factor corresponds to the r-vector, the second to the w-vector,
and the first to the z-vector. By the Segre imbedding i we realize X as a
smooth submanifold of the projective space $P^{2N^2-1}$:

$$i : X \to P^{2N^2-1} \tag{93}$$

The system of equations 90 cuts out a codimension-N linear subspace L
of $P^{2N^2-1}$. The intersection of X with L is a subvariety of dimension N - 1,
denoted Y.

To proceed with the analysis we need to introduce some vector bundles.
First, we have the bundle T(X), the bundle of tangent vectors of
X. Contained in T(X) is the codimension-one subbundle E consisting of
those vectors that are tangent to the $P^{N-1} \times P^{N-1}$ factors of X. Under
the projection map $\pi_3 : X \to P^1$, E is the kernel of the induced morphism
$\pi_{3*} : T(X) \to T(P^1)$. Now the bundle T(X) is itself contained in the larger
bundle $i^* T(P^{2N^2-1})$, the restriction of the tangent bundle of the ambient
space to the subvariety X. Finally, over the space L we have the bundle F
defined to be $T(P^{2N^2-1})/T(L)$. The bundle F is N-dimensional, its dual is
usually called the normal bundle to L in $P^{2N^2-1}$.

All of the bundles just described can be restricted to Y. Over Y, we have

$$E \to T(X) \to F \tag{94}$$


The composition morphism $\sigma : E \to F$ of vector bundles on Y is full rank
N at most points of Y. There is a special subset of Y, denoted $D_{N-1}(\sigma)$, at
which the rank of $\sigma$ is less than or equal to N - 1. Let us call this set
$D_{N-1}$ the degeneracy locus. For generic data the degeneracy locus $D_{N-1}$ is
a codimension N - 1 subvariety of Y, i.e. a finite set of points. It is that
finite set of values $(z_i, w_i, r_i)$ we are after. The r-polynomial p(r) is obtained
by eliminating w and z and has as its roots $\{r_i\}$.

The general theory does not tell how to find the points in the degeneracy
locus, but (under certain conditions of genericity) the Porteous-Thom formula
[14] can be used to compute the number of points in that set. In the remainder
of this section we explain the general computational procedure and apply
the formula for small values of N.

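The notation is as in Chapter 14 of [14]. The degeneracy class of $\sigma$ is an
integral cohomology class in $H^*(Y, Z)$:

$$\Delta_1^{(N-1)}(c(F - E)) = \mathrm{Det}\begin{pmatrix} c_1 & c_0 & \cdots & c_{3-N} \\ \vdots & & & \vdots \\ c_{N-1} & c_{N-2} & \cdots & c_1 \end{pmatrix} \tag{95}$$

The number of points in the degeneracy locus is:

$$\mathrm{card}(D_{N-1}(\sigma)) = \Delta_1^{(N-1)}(c(F - E)) \cap [Y] \tag{96}$$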

We will compute this intersection number by pulling back to the space
X. Recall that $H^*(X, Z)$ is a truncated polynomial ring on three generators
$h_1, h_2, h_3$ subject to the relations:

$$h_1^N = 0, \qquad h_2^N = 0, \qquad h_3^2 = 0 \tag{97}$$

Geometrically, each $h_j$ is Poincaré-dual to a hyperplane in its respective
projective-space factor. In this ring we have the formula

$$c(T(P^{N-1} \times P^{N-1})) = (1 + h_1)^N (1 + h_2)^N \tag{98}$$

so the pullback of the class of F - E is:

$$i^*(c(F - E)) = \frac{(1 + (h_1 + h_2 + h_3))^N}{((1 + h_1)(1 + h_2))^N} \tag{99}$$

Finally, by Poincaré duality and the naturality of Chern classes we have:

$$\Delta_1^{(N-1)}(c(F - E)) \cap [Y] = \left((h_1 + h_2 + h_3)^N \cup i^*\Delta_1^{(N-1)}(c(F - E))\right)[X] \tag{100}$$

The right-hand side of this last equation can be computed from the (finite)
power series expansion for the rational function in the classes $h_1, h_2, h_3$, where
we have

$$h_1^{N-1}\, h_2^{N-1}\, h_3\,[X] = 1 \tag{101}$$

and all other monomial expressions in $h_1$, $h_2$ and $h_3$ vanish when applied to
[X].


9.1  Computation for N = 2

In the case N = 2 the expression $\Delta_1^{(1)}(c(F - E))$ is the determinant of the
1 x 1 matrix $c_1(F - E)$. From equation 99:

$$i^*(c(F - E)) = \frac{(1 + (h_1 + h_2 + h_3))^2}{((1 + h_1)(1 + h_2))^2} \tag{102}$$

from which it is easily shown that $c_1(F - E) = 2h_3$. Then

$$\left((h_1 + h_2 + h_3)^2 \cup i^*\Delta_1^{(1)}(c(F - E))\right)[X] = 4\,h_1 h_2 h_3\,[X] = 4 \tag{103}$$

We conclude there are four r-values for the case N = 2.


9.2  Computation for N = 3

In the case N = 3 the expression $\Delta_1^{(2)}(c(F - E))$ is the determinant of the
2 x 2 matrix

$$\Delta_1^{(2)}(c(F - E)) = \mathrm{Det}\begin{pmatrix} c_1 & c_0 \\ c_2 & c_1 \end{pmatrix} = c_1^2 - c_2 c_0 \tag{104}$$

Now

$$i^*(c(F - E)) = \frac{(1 + (h_1 + h_2 + h_3))^3}{((1 + h_1)(1 + h_2))^3} \tag{105}$$

from which it is easily shown that:

$$c_0 = 1, \qquad c_1 = 3h_3, \qquad c_2 = -3(h_1 h_2 + h_1 h_3 + h_2 h_3) \tag{106}$$

Thus

$$i^*\Delta_1^{(2)}(c(F - E)) = c_1^2 - c_2 c_0 = 3(h_1 h_2 + h_1 h_3 + h_2 h_3) \tag{107}$$

Then

$$\left((h_1 + h_2 + h_3)^3 \cup i^*\Delta_1^{(2)}(c(F - E))\right)[X] = 36\,h_1^2 h_2^2 h_3\,[X] = 36 \tag{108}$$

We conclude there are 36 r-values for the case N = 3.
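
The two hand computations above follow one recipe, which can be mechanized. The following is a minimal sketch in plain Python, assuming the recipe as described in this section: expand the rational function in the truncated ring, form the determinant of the Chern classes $c_k$, multiply by $(h_1 + h_2 + h_3)^N$ and read off the coefficient of $h_1^{N-1} h_2^{N-1} h_3$. The function names and the dictionary representation of polynomials are illustrative; this is not the Mathematica program used for Section 9.3.

```python
from math import comb

def make_ring(N):
    """Arithmetic in Z[h1, h2, h3] / (h1^N, h2^N, h3^2); polynomials are dicts."""
    def mul(p, q):
        out = {}
        for (a1, b1, c1), x in p.items():
            for (a2, b2, c2), y in q.items():
                a, b, c = a1 + a2, b1 + b2, c1 + c2
                if a < N and b < N and c < 2:          # apply the relations
                    out[(a, b, c)] = out.get((a, b, c), 0) + x * y
        return {m: v for m, v in out.items() if v}

    def add(p, q, sign=1):
        out = dict(p)
        for m, v in q.items():
            out[m] = out.get(m, 0) + sign * v
        return {m: v for m, v in out.items() if v}

    return mul, add

def power(p, n, mul):
    out = {(0, 0, 0): 1}
    for _ in range(n):
        out = mul(out, p)
    return out

def det(rows, mul, add):
    """Determinant by Laplace expansion along the first row."""
    if len(rows) == 1:
        return rows[0][0]
    total = {}
    for j, entry in enumerate(rows[0]):
        minor = [r[:j] + r[j + 1:] for r in rows[1:]]
        term = mul(entry, det(minor, mul, add))
        total = add(total, term, 1 if j % 2 == 0 else -1)
    return total

def degeneracy_count(N):
    mul, add = make_ring(N)
    h = {(1, 0, 0): 1, (0, 1, 0): 1, (0, 0, 1): 1}
    # 1 / (1 + x)^N = sum_k (-1)^k C(N + k - 1, k) x^k, truncated at x^N = 0.
    inv1 = {(k, 0, 0): (-1) ** k * comb(N + k - 1, k) for k in range(N)}
    inv2 = {(0, k, 0): (-1) ** k * comb(N + k - 1, k) for k in range(N)}
    total = mul(power(add({(0, 0, 0): 1}, h), N, mul), mul(inv1, inv2))

    def c(k):
        # Degree-k part of the total class; zero for negative k.
        return {m: v for m, v in total.items() if sum(m) == k} if k >= 0 else {}

    size = N - 1
    matrix = [[c(1 + i - j) for j in range(size)] for i in range(size)]
    top = mul(power(h, N, mul), det(matrix, mul, add))
    return top.get((N - 1, N - 1, 1), 0)
```

Calling `degeneracy_count(2)` and `degeneracy_count(3)` reproduces the counts 4 and 36 obtained by hand; the same function can be run for N = 4 and N = 5 and compared with the numbers quoted in Section 9.3.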


9.3  Computation for N > 3

Using the Porteous-Thom formula and a symbolic manipulation program
(Mathematica) we have performed the computations for the values N = 4 and N = 5,
for which the answers are 272 and 2150. These numbers agree with those
produced by formula (51) in Section 4. On geometric grounds we suspect
that the number produced by the hyperdeterminantal formula will always
agree with the number provided by Porteous-Thom - we believe both numbers
are the degree of the r-polynomial for generic N x N complex matrices
M. A rigorous proof of this identity, if true, requires a deeper understanding
of the theory than we now have.

10  Closing Remarks

Various approaches were taken to generate the r-polynomial and to derive
formulas for its degree. The original problem was to find extremal points in
a parametrized system of quadratic hermitian forms on the complex vector
z for which the system has a nonzero solution. Because hermitian forms are
involved, the problem is intrinsically one on the real vector space of z and
$\bar z$ parameters. Our general approach was to enlarge the space, replacing $\bar z$
with a new vector w, to obtain a parametrized system of bilinear forms on z
and w. Methods appropriate for general systems of bilinear forms were then
employed to tackle the problem.

The order in which we came upon the results in the various sections might
be of interest. The earliest computational approach we had was the
families-of-hypersurfaces approach described in Sections 5, 6 and 7 [2]. That method
was adequate for the 2 and 3-block problems, but it fell short of finishing off
the four-block problem (example in Section 8), so we continued to look for an
improved approach. In the process we found (through a lead from Dave Morrison)
the hyperdeterminantal theory of Section 4. The hyperdeterminantal
theory  generalizes  the  families-of-hypersurfaces  approach,  giving  a  formula 
for  the  degree  of  the  hyperdeterminant  associated  with  the  r-polynomial,  but 
it  does  not  provide  a  (reasonable)  computational  procedure  for  finding  the 
roots.  After  looking  at  the  hyperdeterminants  for  a  while,  we  finally  tried 
the  successful  dialytic  approach  described  in  Sections  2  and  3.  The  abstract 
interpretation  of  Section  9  was  the  last  piece  of  the  puzzle  to  fall  into  place. 
It  is  the  geometric  idea  behind  the  dialytic  method  of  Sections  2  and  3. 

Each  part  of  the  theory  tells  something  different.  The  dialytic  approach 
seems  to  be  a  general  computational  procedure  but  it  is  not  clear  why  it 
should  work.  The  abstract  interpretation  provides  the  theoretical  basis  for 
it.  The  hyperdeterminantal  approach  is  the  most  direct  tie  with  the  classical 
theory,  while  the  families-of-hypersurfaces  approach  (as  a  special  case  of  the 
hyperdeterminantal  approach)  is  the  easiest  computational  method  for  the  2 
and  3-block  problems. 

At this point the picture we have could be complete, but there are some
theoretical gaps (next subsection) that could lead to surprises in higher
dimensions.

10.1  Theoretical Gaps

The examples we have checked suggest that we have been successful, but
in fact we still do not have a rigorous proof. The problem is that the
parametrized system of bilinear forms (used in different ways in the dialytic
elimination, the hyperdeterminant and the abstract interpretation) is
not generic by construction, so there remains some doubt that the theorems
we have applied hold true for general N. To see the problem, note that we
start with a complex N x N matrix ($2N^2$ real parameters) from which we
generate a system of N hermitian quadratic forms of size N ($N^3$ real
parameters). Strictly speaking, our use of the hyperdeterminantal formula of
Section 4 and the Porteous-Thom formula of Section 9 is valid only for generic
families of forms. Some additional conditions must be checked for each value
of N to see if any matrices M of size N produce a system of forms for which
the formulas are correct.

To illustrate the potential problem, let us consider a different approach
to the special case N = 2. We have a parametrized pair of forms $H_j(r)$ on
the complex 2-vector z; let us work in the real four-dimensional space of real
and imaginary parts of z. For each value of r the zero set of the pair of forms
is represented geometrically by the intersection of two quadric hypersurfaces
in $P^3$ (this is real projective space now). The parameter $r_0$ is extremal for
this family when the hypersurfaces meet tangentially at some point on the
intersection curve.

The condition that two quadric surfaces should meet tangentially at some
point is well known in the classic literature. The procedure for computing
the associated invariant is presented in Chapter 9, article 202 of [15]. For a
generic pair of forms, the invariant is a polynomial of total degree 24 in the
coefficients of the forms. How do we reconcile this result with the established
fact that the degree of the r-polynomial for N = 2 is 4?

As a first step, we might claim that the polynomial of degree 24 derived
in this manner should have the r-polynomial as a factor. This first claim
turns out to be correct. As a second step, we might claim that one can
compute $\mu$ by finding the 24 roots and then selecting from them the four
roots associated with the r-polynomial. This second claim is incorrect. The
problem is that for quadrics of the type we have to work with, the degree-24
invariant vanishes identically. Thus, the number 24 that is the right answer
for generic families of forms is irrelevant to our problem. If we did not already
know the answer to be 4, we might easily be misled by the computation for
generic data.

From  this  simple  illustration  we  see  that  there  is  danger  in  believing 
our  formulas  for  the  degree  of  the  r-polynomial  for  general  N  without  more 
analysis.  It  would  be  nice  to  find  a  complete,  definitive  solution  to  this 
problem. 

10.2  Do  We  Really  Want  to  Find  the  r-Polynomial? 

Let us reconcile the results obtained with the stated goals of Section 1. We
originally said our goal was to find a polynomial p(r), whose largest root was
$\mu(M)$, as a generalization of the polynomial in equation (5) for the operator
norm of M.

In fact, what we have obtained (by a variety of methods) are matrices
$\Gamma(r)$ of polynomials in the single variable r that are, in general, full rank but
become rank deficient at a finite set of r-values. This finite set of r-values
for which $\Gamma(r)$ drops rank is the set of values we seek.

In theory, a single polynomial of the type specified can be derived from
the matrix $\Gamma(r)$ - one finds the greatest common divisor of the determinants
of all the maximal square submatrices of $\Gamma(r)$. In the classical literature
this polynomial is called the highest invariant factor of $\Gamma(r)$, see [16]. From
a computational point of view it is undesirable to evaluate even one such
determinant, let alone find the greatest common divisor of a large collection
of them.

In fact, the situation is better than it seems. There are numerical methods
for computing the roots of the highest invariant factor of $\Gamma(r)$ without
computing the determinantal polynomials.

For example, if $\Gamma(r)$ is a 3 x 2 matrix of quadratic polynomials, we can
compute the roots as follows. First, write

$$\Gamma(r) = \Gamma_2 r^2 + \Gamma_1 r + \Gamma_0 \tag{109}$$

Observe that $\Gamma(r)$ drops rank at $r_0$ only if there is a nonzero vector v
such that

$$\Gamma(r_0)\, v = 0 \tag{110}$$

Form the 5 x 4 affine matrix S(r) defined by:

$$S(r) = \begin{pmatrix} \Gamma_2 r + \Gamma_1 & \Gamma_0 \\ -I & rI \end{pmatrix} \tag{111}$$

where I is the 2 x 2 identity matrix. Note that the equation

$$S(r_0)\, w = 0 \tag{112}$$

has a nonzero solution w if and only if $r_0$ is a root of the highest invariant
factor of $\Gamma(r)$. So we have reduced the problem to one of finding the invariant
factors of an affine matrix function of r.

To  find  the  roots  of  the  highest  invariant  factor  of  an  affine  matrix  function 
we  proceed  as  follows: 

1.  Pick  a  maximal  square  submatrix  (4x4  for  our  example) 

2.  Solve  the  generalized  eigenvalue  problem  for  that  matrix 

3.  Pick  another  maximal  square  submatrix 

4.  Solve  the  generalized  eigenvalue  problem  for  that  matrix 

5.  Compare  the  solutions  for  the  two  problems 

6.  Iterate  as  needed 

In  theory  one  might  have  to  solve  many  generalized  eigenvalue  problems 
to  reduce  the  set  of  common  roots  to  the  set  of  desired  solutions.  In  practice, 
one  should  get  the  desired  set  of  values  after  only  a  few  trial  problems. 

In fact, one can get away with solving only one generalized eigenvalue
problem if the finite set of roots obtained can be checked by back substitution.
The dialytic procedure described in Sections 2 and 3 can be checked
in this way, so the issue of not deriving a single polynomial is not a real
inconvenience.
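
The single-subproblem variant with back substitution can be sketched as follows. This is a minimal sketch, assuming a hypothetical 3 x 2 quadratic matrix polynomial constructed so that it drops rank only at r = 2; the coefficient matrices and function names are illustrative. The generalized eigenvalue problem is reduced to a standard one by inverting the constant part of the subpencil, which is legitimate only because that part is invertible in this example. A general-purpose generalized eigenvalue solver would avoid that restriction.

```python
import numpy as np

# Hypothetical example: Gamma(r) = [[r - 2, 0], [0, r^2 + 1], [r - 2, (r - 2)(r + 1)]]
G2 = np.array([[0.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
G1 = np.array([[1.0, 0.0], [0.0, 0.0], [1.0, -1.0]])
G0 = np.array([[-2.0, 0.0], [0.0, 1.0], [-2.0, -2.0]])

def gamma(r):
    return G2 * r**2 + G1 * r + G0

# The 5 x 4 affine matrix of the text, written as S(r) = S0 + r * S1.
I2 = np.eye(2)
S0 = np.block([[G1, G0], [-I2, np.zeros((2, 2))]])
S1 = np.block([[G2, np.zeros((3, 2))], [np.zeros((2, 2)), I2]])

def candidate_roots(rows):
    """Roots of det(A + r B) = 0 for the 4 x 4 subpencil formed by the given rows."""
    A, B = S0[rows], S1[rows]
    # (A + r B) w = 0  <=>  inv(A) B w = (-1 / r) w, assuming A is invertible.
    lam = np.linalg.eigvals(np.linalg.solve(A, B))
    return [-1.0 / x for x in lam if abs(x) > 1e-12]   # lam = 0 is a root at infinity

def r_values(rows, tol=1e-8):
    """Keep only the candidates at which Gamma(r) really drops rank."""
    accepted = []
    for r in candidate_roots(rows):
        smallest = np.linalg.svd(gamma(r), compute_uv=False)[-1]
        if smallest < tol:
            accepted.append(r)
    return accepted

roots = r_values([0, 1, 3, 4])
```

For this example the subpencil has the finite roots 2, i and -i; back substitution discards the two complex candidates, leaving r = 2 as the only value at which the 3 x 2 matrix loses rank.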

Of course, there is no denying that our algorithm produces large matrices
in the generalized eigenvalue computations for large values of N. The
question of a more efficient computation for the set of r-values remains open.

10.3  Reducibility of the r-Polynomial

As a final note, it is interesting that in the case N = 2 the r-polynomial is
the difference between two squares, hence it is a reducible polynomial (see
Appendix A).

The Mu-Polynomial for Two-by-Two Matrices

Consider the 2 x 2 complex matrix M:

$$M = \begin{pmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{pmatrix} \tag{A1}$$

We take as a definition:

$$\mu(M) = \max_{\Theta}\, \rho(e^{i\Theta} M) \tag{A2}$$

where  the  maximization  is  over  all  real  diagonal  2x2  matrices  0.  This  note  is  to  establish  how  the 
fimction  p(M)  can  be  computed  by  finding  the  largest  real  root  of  a  real  polynomial  in  one  variable 
with  coefficients  that  are  fimcdons  of  the  complex  matrix  M.  The  polynomial,  written  as  a  function  of 
the  variable  r,  is: 


p(r)  =  -  r**  -  (ImiiP  +  Im22l^)  ^  +  ldet(M)P  +  4  r'*  Imi2l^  Im2il^  (A14) 
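As a numerical illustration of this statement, the sketch below compares the largest real root of p(r) with a brute-force grid search over $\Theta$ in definition (A2). It is a check under assumptions, not part of the derivation: the function names and the grid size are hypothetical, the roots are found with `numpy.roots` on p written as a polynomial in s = r^2, and the search uses only one angle because a common phase factor does not change the spectral radius.

```python
import numpy as np

def mu_polynomial_root(M):
    """Largest real root r >= 0 of p(r) in (A14), found numerically.

    With s = r^2, p = -(s^2 - a*s + d)^2 + 4*c*s^2, where
    a = |m11|^2 + |m22|^2, d = |det M|^2, c = |m12|^2 |m21|^2.
    """
    M = np.asarray(M, dtype=complex)
    a = abs(M[0, 0]) ** 2 + abs(M[1, 1]) ** 2
    d = abs(M[0, 0] * M[1, 1] - M[0, 1] * M[1, 0]) ** 2
    c = abs(M[0, 1]) ** 2 * abs(M[1, 0]) ** 2
    # Coefficients of p as a quartic in s, highest power first.
    coeffs = [-1.0, 2 * a, -(a * a + 2 * d - 4 * c), 2 * a * d, -d * d]
    s_roots = np.roots(coeffs)
    # Keep roots that are real (up to a loose tolerance) and not negative.
    real_s = [s.real for s in s_roots
              if abs(s.imag) < 1e-6 * (1 + abs(s)) and s.real > -1e-12]
    return float(np.sqrt(max(max(real_s), 0.0)))

def mu_brute_force(M, n_grid=20000):
    """Grid search for the maximum of rho(diag(e^{i theta}, 1) M)."""
    M = np.asarray(M, dtype=complex)
    theta = np.linspace(0.0, 2 * np.pi, n_grid, endpoint=False)
    DM = np.repeat(M[None, :, :], n_grid, axis=0)
    DM[:, 0, :] *= np.exp(1j * theta)[:, None]  # scale first row by e^{i theta}
    return float(np.abs(np.linalg.eigvals(DM)).max())

# Hypothetical test matrix.
M_example = [[1 + 2j, 0.5j], [2, -1 + 1j]]
mu_root = mu_polynomial_root(M_example)
mu_grid = mu_brute_force(M_example)
```

For a generic matrix such as the one above, the two values should agree to within the resolution of the grid.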


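The derivation follows.

Define the pair of 2 x 2 matrices $H^1(r)$, $H^2(r)$ by the formulas:

$$ H^1(r) = \begin{bmatrix} \bar m_{11} \\ \bar m_{12} \end{bmatrix} \begin{bmatrix} m_{11} & m_{12} \end{bmatrix} - \begin{bmatrix} r^2 & 0 \\ 0 & 0 \end{bmatrix} \tag{A3} $$

$$ H^2(r) = \begin{bmatrix} \bar m_{21} \\ \bar m_{22} \end{bmatrix} \begin{bmatrix} m_{21} & m_{22} \end{bmatrix} - \begin{bmatrix} 0 & 0 \\ 0 & r^2 \end{bmatrix} \tag{A4} $$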


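Then $\mu(M)$ is the largest real value of r such that there is a nonzero complex vector z for which

$$ z^* H^j(r)\, z = 0, \qquad j = 1, 2 \tag{A5} $$

The complex vector z is an eigenvector of $e^{i\Theta} M$ (for a maximizing $\Theta$) associated with the eigenvalue $\mu(M)$.
In previous work it has been shown that there is a nonzero real vector $t^\circ$ such that:

$$ \left[ t_1^\circ H^1(\mu(M)) + t_2^\circ H^2(\mu(M)) \right] z = 0 \tag{A6} $$

Therefore, if we consider the matrix function H(t,r) defined by:

$$ H(t,r) = t_1 H^1(r) + t_2 H^2(r) \tag{A7} $$

we find that the matrix $H(t^\circ, \mu(M))$ is rank deficient, with the vector z in its kernel. Now let W be a
unitary transformation whose first column is proportional to z. Then:

$$ W^* H^1(\mu(M))\, W = \begin{bmatrix} 0 & a_{12}^1 \\ \bar a_{12}^1 & x^1 \end{bmatrix}, \qquad W^* H^2(\mu(M))\, W = \begin{bmatrix} 0 & a_{12}^2 \\ \bar a_{12}^2 & x^2 \end{bmatrix} \tag{A8} $$

and

$$ W^* H(t^\circ, \mu(M))\, W = \begin{bmatrix} 0 & 0 \\ 0 & x^\circ \end{bmatrix} \tag{A9} $$

Therefore:

$$ W^* H(t, \mu(M))\, W = \begin{bmatrix} 0 & 0 \\ 0 & x^\circ \end{bmatrix} + (t_1 - t_1^\circ) \begin{bmatrix} 0 & a_{12}^1 \\ \bar a_{12}^1 & x^1 \end{bmatrix} + (t_2 - t_2^\circ) \begin{bmatrix} 0 & a_{12}^2 \\ \bar a_{12}^2 & x^2 \end{bmatrix} \tag{A10} $$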

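From this expression, we conclude that the polynomial function f(t,r) defined by:

$$ f(t,r) = \det(H(t,r)) = -r^2 |m_{12}|^2 t_1^2 + \left[ (|m_{11}|^2 - r^2)(|m_{22}|^2 - r^2) + |m_{21}|^2 |m_{12}|^2 - 2\,\mathrm{Re}(\bar m_{12} m_{11} \bar m_{21} m_{22}) \right] t_1 t_2 - r^2 |m_{21}|^2 t_2^2, \tag{A11} $$

which reduces to:

$$ f(t,r) = -r^2 |m_{12}|^2 t_1^2 + \left[ r^4 - (|m_{11}|^2 + |m_{22}|^2) r^2 + |\det(M)|^2 \right] t_1 t_2 - r^2 |m_{21}|^2 t_2^2, $$

has the following properties:

$$ f(t^\circ, \mu(M)) = 0 \tag{A12a} $$

$$ \frac{\partial f(t^\circ, \mu(M))}{\partial t_j} = 0, \qquad j = 1, 2 \tag{A12b} $$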
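The identity between det(H(t,r)) and the reduced form of f(t,r) can be spot-checked numerically. The sketch below is a hedged illustration: it assumes that (A3) and (A4) define $H^j(r)$ as the outer product of the conjugated j-th row of M with that row, minus $r^2$ in the (j,j) position, and that the outer terms of f carry the factor $r^2$. The function names and the sample values are hypothetical.

```python
def h_matrices(M, r):
    """H^1(r), H^2(r) as nested lists, assuming the form described above."""
    (m11, m12), (m21, m22) = M
    conj = lambda x: x.conjugate()
    H1 = [[abs(m11) ** 2 - r ** 2, conj(m11) * m12],
          [conj(m12) * m11, abs(m12) ** 2]]
    H2 = [[abs(m21) ** 2, conj(m21) * m22],
          [conj(m22) * m21, abs(m22) ** 2 - r ** 2]]
    return H1, H2

def det_H(M, t, r):
    """det(t1 * H^1(r) + t2 * H^2(r)), computed entry by entry."""
    H1, H2 = h_matrices(M, r)
    h = [[t[0] * H1[i][j] + t[1] * H2[i][j] for j in range(2)]
         for i in range(2)]
    return h[0][0] * h[1][1] - h[0][1] * h[1][0]

def f_reduced(M, t, r):
    """Reduced quadratic form f(t, r) in t1, t2."""
    (m11, m12), (m21, m22) = M
    det2 = abs(m11 * m22 - m12 * m21) ** 2
    cross = r ** 4 - (abs(m11) ** 2 + abs(m22) ** 2) * r ** 2 + det2
    return (-r ** 2 * abs(m12) ** 2 * t[0] ** 2
            + cross * t[0] * t[1]
            - r ** 2 * abs(m21) ** 2 * t[1] ** 2)

# Hypothetical sample point: the two expressions should coincide.
M_example = ((1 + 2j, 0.5j), (2, -1 + 1j))
gap = abs(det_H(M_example, (0.7, -1.3), 1.9)
          - f_reduced(M_example, (0.7, -1.3), 1.9))
```

Setting t = (1, 0) also shows that det H^1(r) = -r^2 |m12|^2, which is never positive for real r.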


From the pair of equations in (A12b) it is possible to eliminate the unknown parameters $t_j^\circ$ and arrive
at a polynomial in the coefficients of M and $\bar M$ that must be satisfied by $\mu(M)$. To this end, consider
the pair of polynomial equations:

$$ \frac{\partial f(t,r)}{\partial t_j} = 0, \qquad j = 1, 2 \tag{A13} $$

and eliminate $t_1$ and $t_2$. This elimination is performed in the appendix (the Computation section below). The final result is: equation
(A13) can be solved for some nonzero vector $t^\circ$ if and only if r is a root of the polynomial p(r) defined
by

$$ p(r) = -\left[ r^4 - \left(|m_{11}|^2 + |m_{22}|^2\right) r^2 + |\det(M)|^2 \right]^2 + 4 r^4 |m_{12}|^2 |m_{21}|^2 \tag{A14} $$

It is also part of the result shown in the appendix that if r is real then the nonzero vector $t^\circ$ is real. By
what we have shown so far, $\mu(M)$ is a real root of p(r). What we would like to show is the following:

Claim: The polynomial p(r) has at least one nonnegative real root, and the largest real root is $\mu(M)$.

Proof of Claim: By the construction of p(r) we know that $\mu(M)$, a nonnegative real number, is a root,
so the first part of the claim must be true if equation (A14) is correct. It turns out that it is easy to verify
the first part of the claim directly, as follows.

The polynomial p(r) can be factored as the difference between two squares. One of the factors (up to sign) is:

$$ p_1(r) = r^4 - \left(|m_{11}|^2 + |m_{22}|^2 + 2 |m_{12}||m_{21}|\right) r^2 + |\det(M)|^2 \tag{A15} $$

Considering (A15) as a quadratic polynomial in $r^2$, the discriminant is:

$$ \mathrm{disc}(M) = \left(|m_{11}|^2 + |m_{22}|^2 + 2 |m_{12}||m_{21}|\right)^2 - 4 |\det(M)|^2 \tag{A16} $$

which is itself a difference between two squares. One of the two factors is obviously nonnegative
(being the sum of nonnegative quantities), the other is:

$$ |m_{11}|^2 + |m_{22}|^2 + 2 |m_{12}||m_{21}| - 2 |\det(M)| \tag{A17} $$

By the triangle inequality, $|\det(M)| \le |m_{11}||m_{22}| + |m_{12}||m_{21}|$, so the expression in (A17) is greater than or equal to

$$ |m_{11}|^2 + |m_{22}|^2 - 2 |m_{11}||m_{22}| \tag{A18} $$

which is a square, hence a nonnegative quantity. It follows that (A15) has two real roots (for $r^2$), both
of which are nonnegative because the coefficient of $r^2$ is nonpositive and the constant term is nonnegative. Then
there are at least two nonnegative real roots of p as a function of r, counted with multiplicity, so the
first part of the claim has been verified.
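The factorization used in this argument is easy to check numerically. The sketch below is illustrative only: the names `p1`, `p2`, and `largest_root_of_p1` are hypothetical, with `p2` standing for the second factor, obtained from (A15) by changing the sign of the term $2|m_{12}||m_{21}|$. The largest root is obtained from the quadratic formula applied to (A15) as a polynomial in $r^2$.

```python
import math

def invariants(M):
    """Return a = |m11|^2 + |m22|^2, g = |m12||m21|, d = |det M|^2."""
    (m11, m12), (m21, m22) = M
    a = abs(m11) ** 2 + abs(m22) ** 2
    g = abs(m12) * abs(m21)
    d = abs(m11 * m22 - m12 * m21) ** 2
    return a, g, d

def p(M, r):
    """The polynomial p(r) of (A14)."""
    a, g, d = invariants(M)
    return -(r ** 4 - a * r ** 2 + d) ** 2 + 4 * r ** 4 * g ** 2

def p1(M, r):
    """The factor (A15)."""
    a, g, d = invariants(M)
    return r ** 4 - (a + 2 * g) * r ** 2 + d

def p2(M, r):
    """The companion factor, so that p = -p1 * p2."""
    a, g, d = invariants(M)
    return r ** 4 - (a - 2 * g) * r ** 2 + d

def largest_root_of_p1(M):
    """Largest real root of (A15), via the quadratic formula in r^2."""
    a, g, d = invariants(M)
    disc = (a + 2 * g) ** 2 - 4 * d      # (A16); nonnegative by (A17), (A18)
    return math.sqrt(((a + 2 * g) + math.sqrt(max(disc, 0.0))) / 2)

# Hypothetical sample matrix.
M_example = ((1 + 2j, 0.5j), (2, -1 + 1j))
r0 = largest_root_of_p1(M_example)
```

Because the discriminant (A16) is nonnegative, the square roots in `largest_root_of_p1` are always real, which is the content of the first part of the claim.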

We now consider the second part of the claim. Suppose $r_0$ is the largest real root of (A14). By the
computation in the appendix, there is a nonzero real vector $t^\circ$ such that

$$ \frac{\partial f(t^\circ, r_0)}{\partial t_j} = 0, \qquad j = 1, 2 \tag{A19} $$

Because f(t,r) is homogeneous of degree two in t, Euler's identity $2f = t_1\, \partial f/\partial t_1 + t_2\, \partial f/\partial t_2$ gives $f(t^\circ, r_0) = 0$, so $H(t^\circ, r_0)$ is singular. We can find a
unitary matrix V such that (compare with equation (A10)):

$$ V^* H(t, r_0)\, V = \begin{bmatrix} 0 & 0 \\ 0 & x^\circ \end{bmatrix} + (t_1 - t_1^\circ) \begin{bmatrix} y^1 & a_{12}^1 \\ \bar a_{12}^1 & x^1 \end{bmatrix} + (t_2 - t_2^\circ) \begin{bmatrix} y^2 & a_{12}^2 \\ \bar a_{12}^2 & x^2 \end{bmatrix} \tag{A20} $$

First suppose $x^\circ$ is nonzero. Then equation (A19) implies $y^1 = 0$ and $y^2 = 0$, so the first column of V
is a vector z satisfying (A5) with $r = r_0$. Hence $r_0 \le \mu(M)$, and since $\mu(M)$ is itself a real root of p(r) while $r_0$ is the largest real root, $\mu(M) = r_0$.

Next, suppose $x^\circ$ is 0. Then $H(t^\circ, r_0) = 0$, and if both $t_1^\circ$ and $t_2^\circ$ are nonzero then the two matrices
$H^1(r_0)$ and $H^2(r_0)$ are proportional. Then any nonzero vector z satisfying

$$ z^* H^1(r_0)\, z = 0 \tag{A21} $$

will also satisfy the same equation with $H^1$ replaced with $H^2$. Now the matrix $H^1(r)$ is indefinite (signature 0)
for all real r, so there is always a nonzero vector z satisfying (A21), so if both $t_1^\circ$ and $t_2^\circ$ are
nonzero we may conclude that $\mu(M) = r_0$.

Finally, if (say) $t_1^\circ$ is zero, then $t_2^\circ$ is nonzero and so $H^2(r_0)$ must be zero (recall we are assuming
$H(t^\circ, r_0) = 0$). In this case $m_{21}$ is zero. It can be verified directly from the definition (A2) that $\mu(M)$ is
the maximum of $\{|m_{11}|, |m_{22}|\}$. Also, in this special case the polynomial p(r) is:

$$ p(r) = -\left[ r^4 - \left(|m_{11}|^2 + |m_{22}|^2\right) r^2 + |m_{11}|^2 |m_{22}|^2 \right]^2 \tag{A22} $$

and the largest real root $r_0$ is the maximum of $\{|m_{11}|, |m_{22}|\}$, which is $\mu(M)$. A symmetric argument
applies if $t_2^\circ$ is zero. The claim is verified.


Computation

We derive the result stated between (A13) and (A14) in the above text.
Consider equation (A13):

$$ \frac{\partial f(t,r)}{\partial t_j} = 0, \qquad j = 1, 2 \tag{A13} $$

The function f(t,r) is homogeneous, quadratic in $t_1$, $t_2$ so equation (A13) can be rewritten:

$$ Q(r^2)\, t = \begin{bmatrix} q_{11}(r^2) & q_{12}(r^2) \\ q_{21}(r^2) & q_{22}(r^2) \end{bmatrix} \begin{bmatrix} t_1 \\ t_2 \end{bmatrix} = 0 \tag{A23} $$

where (use the expressions in (A11)):

$$ q_{11}(r^2) = -2 r^2 |m_{12}|^2 $$

$$ q_{21}(r^2) = r^4 - \left(|m_{11}|^2 + |m_{22}|^2\right) r^2 + |\det(M)|^2 $$

$$ q_{12}(r^2) = q_{21}(r^2) $$

$$ q_{22}(r^2) = -2 r^2 |m_{21}|^2 $$

There is a nonzero vector t satisfying (A13) if and only if

$$ \det(Q(r^2)) = q_{11}(r^2)\, q_{22}(r^2) - q_{12}(r^2)\, q_{21}(r^2) = 0. \tag{A24} $$

But it is easy to check that:

$$ \det(Q(r^2)) = p(r) = -\left[ r^4 - \left(|m_{11}|^2 + |m_{22}|^2\right) r^2 + |\det(M)|^2 \right]^2 + 4 r^4 |m_{12}|^2 |m_{21}|^2 \tag{A25} $$

If r is real, the matrix $Q(r^2)$ is real and symmetric, and its kernel is spanned by real vectors. The demonstration is complete.
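The elimination step can be illustrated with a few lines of code. This is a sketch under the assumption that the entries of $Q(r^2)$ are the gradient coefficients of f, namely $q_{11} = -2r^2|m_{12}|^2$, $q_{22} = -2r^2|m_{21}|^2$ and $q_{12} = q_{21} = r^4 - (|m_{11}|^2 + |m_{22}|^2) r^2 + |\det(M)|^2$. The function names are hypothetical, and the kernel formula assumes the first row of Q is not identically zero.

```python
def Q(M, r):
    """The 2 x 2 real symmetric matrix Q(r^2) as a nested list."""
    (m11, m12), (m21, m22) = M
    s = r * r
    b = (s * s - (abs(m11) ** 2 + abs(m22) ** 2) * s
         + abs(m11 * m22 - m12 * m21) ** 2)
    return [[-2 * s * abs(m12) ** 2, b],
            [b, -2 * s * abs(m21) ** 2]]

def det_Q(M, r):
    """det Q(r^2), which should coincide with p(r)."""
    q = Q(M, r)
    return q[0][0] * q[1][1] - q[0][1] * q[1][0]

def kernel_vector(M, r):
    """A real vector t with Q(r^2) t = 0, valid when det Q(r^2) = 0."""
    q = Q(M, r)
    return (q[0][1], -q[0][0])

# Hypothetical example: for the all-ones matrix, r = 2 is a root of p.
M_ones = ((1, 1), (1, 1))
t_kernel = kernel_vector(M_ones, 2.0)
```

For the all-ones matrix the kernel vector at r = 2 is real, in line with the final remark of the computation.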